Stochastic Calculus and Mathematical Finance 



2012 



P. Ouwehand 



Department of Mathematical Sciences 
Stellenbosch University 



Contents

1 Brownian Motion
  1.1 Modelling Stock Returns
  1.2 Definition and Existence of Brownian Motion
  1.3 Characteristic Functions and Independence
  1.4 The Multi-Normal Distribution
  1.5 Gaussian Processes
  1.6 Some Useful and Interesting Properties
  1.7 Exercises

2 Martingale Theory
  2.1 Elements of Discrete-Time Martingales
    2.1.1 Stochastic Processes and Filtrations
    2.1.2 Martingales, Submartingales, Supermartingales
    2.1.3 Games and Strategies
  2.2 Stopping Times and Optional Stopping
  2.3 The Martingale Convergence Theorem
  2.4 Uniformly Integrable Martingales
    2.4.1 Uniform Integrability
    2.4.2 UI Martingales
    2.4.3 Optional Sampling of UI Supermartingales
  2.5 Upwards and Downwards
  2.6 Martingale Inequalities
  2.7 Continuous-Parameter Martingales
    2.7.1 Stopping Times
    2.7.2 Martingales in Continuous Time
    2.7.3 Optional Sampling
    2.7.4 Spaces of Martingales
    2.7.5 Local Martingales

3 Prelude to Stochastic Integration
  3.1 The Riemann-Stieltjes Integral: Motivation and Definition
  3.2 Functions of Finite Variation
  3.3 The Lebesgue-Stieltjes Integral
  3.4 Stieltjes Integration of Stochastic Processes
  3.5 Quadratic Variation and Covariation

4 Outline of Stochastic Integration
  4.1 L^2-Theory of the Stochastic Integral
    4.1.1 Basic Integrands
    4.1.2 Simple Integrands
    4.1.3 The space L^2(M)
    4.1.4 The Stochastic Integral on L^2(M)
    4.1.5 An Example
    4.1.6 Approximation
    4.1.7 Quadratic Variation and Covariation of Stochastic Integrals
    4.1.8 The Associative Law
  4.2 Integration w.r.t. Semimartingales
    4.2.1 Continuous Local Martingales
    4.2.2 Continuous Semimartingales
    4.2.3 Approximation
  4.3 The Ito Formula
  4.4 Differential Notation

5 Girsanov's Theorem and the Martingale Representation Theorem
  5.1 Changes of Measure and Girsanov's Theorem
    5.1.1 Characteristic Functions and Stochastic Exponentials
    5.1.2 Changes of Measure
  5.2 The Martingale Representation Theorem
    5.2.1 Motivation
    5.2.2 Statement of Main Results
    5.2.3 Preliminary Technicalities
    5.2.4 Proof of Main Results

6 SDEs and PDEs
  6.1 SDEs: Existence and Uniqueness of Solutions
  6.2 The Linear SDE
    6.2.1 The Linear SDE with additive noise only
    6.2.2 The Homogeneous Linear SDE with Multiplicative Noise
    6.2.3 The General Linear SDE
  6.3 Solving PDEs Probabilistically
    6.3.1 Solving the Heat Equation Probabilistically
    6.3.2 The Black-Scholes PDE: A Heuristic Approach
    6.3.3 The Feynman-Kac Theorem

7 Financial Modelling in Continuous Time
  7.1 Stochastic Financial Modelling
    7.1.1 Basic Notions
    7.1.2 Martingale Pricing
  7.2 The Generalized Black-Scholes Model
    7.2.1 Construction of a Risk-Neutral Measure via Girsanov's Theorem
    7.2.2 No-Arbitrage and the Existence of a Risk-Neutral Measure
    7.2.3 Hedging of European Contingent Claims
    7.2.4 European Vanilla Call and Put Options
    7.2.5 Caveat: Arbitrage in the Black-Scholes Model
  7.3 The Black-Scholes PDE
  7.4 Correlated Brownian Motions
  7.5 Change of Numeraire
    7.5.1 Introduction to Change of Numeraire
    7.5.2 Mechanics of Changes of Numeraire
    7.5.3 A General Option Pricing Formula
    7.5.4 Applications

8 Interest Rate Modelling
  8.1 Modelling Fixed Income: Introduction
    8.1.1 Classification of Interest Rate Models
    8.1.2 Bond Market Basics
    8.1.3 Modelling the Bond Market
  8.2 Modelling the Short Rate
    8.2.1 The Term Structure PDE
    8.2.2 Martingale Models of the Short Rate
    8.2.3 Common Short Rate Models
    8.2.4 Term Structure Derivatives
    8.2.5 Lognormal Models
  8.3 Affine Term Structure Models
    8.3.1 Mechanics of ATS models
    8.3.2 Bond Options
  8.4 The Heath-Jarrow-Morton Framework
    8.4.1 The Set-Up
    8.4.2 Martingale Modelling
    8.4.3 Examples and Applications
  8.5 Market Models: Preliminaries
    8.5.1 Black's Models
    8.5.2 Review of Changes of Measure and Numeraire; LIBOR Rates
  8.6 Lognormal Forward LIBOR Market Models
    8.6.1 The Brace-Gatarek-Musiela Approach to Forward LIBOR
    8.6.2 The Musiela-Rutkowski Approach to Forward LIBOR
    8.6.3 Jamshidian's Approach to Forward LIBOR
  8.7 Exercises



Chapter 1 

Brownian Motion 



1.1 Modelling Stock Returns 

Any model of stock price behaviour must be stochastic, i.e. it must incorporate the random 
nature of price behaviour. The simplest such models are random walks: Let $X_t$, $t = 1, 2, \dots$
be a family of identically distributed random variables, and let $S_0$ be the stock price at $t = 0$. We might
(naively) attempt to model the stock price process by

$$S_t = S_{t-1} + X_t \quad\text{i.e.}\quad S_t = S_0 + \sum_{u=1}^{t} X_u$$

The intuition behind this is that the price at time $t$ equals the price at time $t - 1$ plus a
"random shock", modelled by $X_t$.

We should also assume that these shocks are independent. Why? If we could predict 
today that the stock price is going to go up tomorrow, this makes the stock more attractive 
today. Thus more people would buy it today, forcing the stock price up today, until it reaches 
the level predicted. Thus any change in the stock price must essentially be unpredictable. 
This is just a version of the Efficient Markets Hypothesis, which, loosely, asserts that all available
information about a corporation is instantly reflected in its stock price. Thus future changes 
in price are not dependent on past changes in price. 

There are several reasons why a random walk model of stock prices is inadequate, but an
obvious one is that it doesn't take scale into account: for stock prices, we expect the change
in price to be proportional to the current price. To see this, consider two companies in two
parallel universes, A and B. The universes and the companies are identical, except for one
thing. In universe A, the company has issued 100 shares, each trading at $100. In universe
B, the company has undertaken a 2-for-1 stock split, so that it has issued 200 shares, each
trading at $50. Both companies are otherwise identical, e.g. they are both worth $10 000.
One day an earthquake causes massive damage, and both companies lose half their value. The
shares in universe A now trade at $50, whereas those in universe B trade at $25. Thus the 
share price has not dropped by the same amount in both universes: Each share has lost the 
same proportion of its value. 

Simply put, if investors require a return of 14%, then they require that return irrespective 
of whether the share price is $50 or $100. 

The shares of A, B change by the same factor, i.e. they have exactly the same change in
returns (but not the same absolute change in price). This is reflected in, e.g., the binomial
model, where shares can go up by a factor of $u$ or down by a factor of $u^{-1}$. But a multiplicative
change in the stock price amounts to an additive change in the logarithm of the stock price:

$$S_{t+\Delta t} = u^{\pm 1} S_t \quad\text{implies}\quad \ln S_{t+\Delta t} = \ln S_t \pm \ln u$$

i.e. if we define the returns process $R_t$ by $S_t = S_0 e^{R_t}$ (i.e. $R_t := \ln \frac{S_t}{S_0}$) and define $\delta := \ln u$,
we have

$$R_{t+\Delta t} = R_t \pm \delta$$

A better random model of stock prices is therefore one in which the returns process Rt 
follows a random walk. 
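
The scale argument can be illustrated with a minimal Python sketch. The up-factor `u = 1.1`, the particular sequence of moves, and the helper name `apply_moves` are illustrative assumptions, not part of the text. The two "universes" start at $100 and $50, experience the same multiplicative shocks, and end up with identical log-returns, each equal to a sum of steps $\pm\delta$ with $\delta = \ln u$.

```python
import math

def apply_moves(s0, u, moves):
    """Multiply the price by u for each +1 move and by 1/u for each -1 move."""
    s = s0
    for m in moves:
        s *= u ** m
    return s

u = 1.1                                  # illustrative up-factor (assumption)
moves = [+1, +1, -1, +1, -1, -1, +1]     # illustrative sequence of shocks

s_a = apply_moves(100.0, u, moves)       # universe A: share starts at $100
s_b = apply_moves(50.0, u, moves)        # universe B: after the 2-for-1 split

# The return R_t = ln(S_t / S_0) is a sum of +/- delta steps, delta = ln u.
delta = math.log(u)
r_a = math.log(s_a / 100.0)
r_b = math.log(s_b / 50.0)
r_walk = sum(m * delta for m in moves)

print(s_a, s_b)          # prices differ by the factor 2 throughout
print(r_a, r_b, r_walk)  # the three returns agree up to rounding
```

The prices differ in absolute terms, but the returns coincide, which is why the random walk is placed on $R_t$ rather than on $S_t$.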

We now seek a continuous-time version of the random walk: a stochastic process that is
changing because of random shocks at every instant in time. Consider a time interval $[0, T]$
and let $N$ be a (large) integer. Define $\Delta t := \frac{T}{N}$. Let $X_n$, $n = 1, 2, 3, \dots$ be independent
Bernoulli random variables with

$$\mathbb{P}(X_n = \Delta x) = p \quad\text{and}\quad \mathbb{P}(X_n = -\Delta x) = 1 - p =: q$$

where $\Delta x > 0$. For $t = 0, \Delta t, 2\Delta t, \dots, N\Delta t = T$, let $R_t := \sum_{k=1}^{n} X_k$, where $t = n\Delta t$. Thus
$R_t$ is a random walk, and

$$R_{t+\Delta t} = R_t \pm \Delta x$$

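Some simple calculations yield

$$\mathbb{E}[R_t] = n(p - q)\Delta x = (p - q)\frac{\Delta x}{\Delta t}\,t \qquad \operatorname{Var}(R_t) = n\left(\Delta x^2 - (p - q)^2\Delta x^2\right) = 4pq\,\frac{\Delta x^2}{\Delta t}\,t$$

Now suppose we can observe the process $R_t$ and want $\mathbb{E}[R_t] = \mu t$ and $\operatorname{Var}(R_t) = \sigma^2 t$, where
$\mu, \sigma$ are constants, and $\sigma > 0$. (We want $\sigma > 0$, otherwise $\operatorname{Var}(R_t) = 0$, in which case $R_t$
would be non-random.)

In the continuous limit, i.e. as $N \to \infty$ and $\Delta t \to 0$, we must have

$$(p - q)\frac{\Delta x}{\Delta t} \to \mu \qquad 4pq\,\frac{\Delta x^2}{\Delta t} \to \sigma^2$$

The first equation yields $\Delta x \approx \frac{\mu\,\Delta t}{p - q}$ when $\Delta t$ is small. Substituting into the second equation,
we see that

$$\frac{4pq}{(p - q)^2}\,\mu^2\,\Delta t \approx \sigma^2$$

when $\Delta t$ is small. Now since $\Delta t \to 0$, we must have $\frac{4pq}{(p - q)^2} \to \infty$, for otherwise the product
$\frac{4pq}{(p - q)^2}\,\mu^2\,\Delta t$ would tend to 0, not $\sigma^2$. It is therefore necessary that $p - q \to 0$, and thus $p, q$ must
both tend to $\frac{1}{2}$ as $\Delta t \to 0$. From the fact that $4pq\,\frac{\Delta x^2}{\Delta t} \to \sigma^2$, we then see that we must have

$$\Delta x \approx \sigma\sqrt{\Delta t}$$

for small $\Delta t$.

We had $\Delta x \approx \frac{\mu\,\Delta t}{p - q}$ for small $\Delta t$, and thus $p - q \approx \frac{\mu}{\sigma}\sqrt{\Delta t}$. Since $p + q = 1$, we must have

$$p = \tfrac{1}{2}\left(1 + \tfrac{\mu}{\sigma}\sqrt{\Delta t}\right) \qquad q = \tfrac{1}{2}\left(1 - \tfrac{\mu}{\sigma}\sqrt{\Delta t}\right)$$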






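As a check, note that

$$\mathbb{E}[R_t] = (p - q)\frac{\Delta x}{\Delta t}\,t = \frac{\mu}{\sigma}\sqrt{\Delta t}\cdot\frac{\sigma\sqrt{\Delta t}}{\Delta t}\,t = \mu t$$

and

$$\operatorname{Var}(R_t) = 4pq\,\frac{\Delta x^2}{\Delta t}\,t = \left(1 - \frac{\mu^2}{\sigma^2}\Delta t\right)\sigma^2 t = \sigma^2 t - \mu^2 t\,\Delta t \to \sigma^2 t$$

as should be the case.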
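
The same check can be done numerically. The sketch below evaluates the exact mean $n(p - q)\Delta x$ and variance $4pq\,n\,\Delta x^2$ of the random walk for the choices of $p$, $q$ and $\Delta x$ derived above. The function name `walk_moments` and the parameter values $\mu = 0.08$, $\sigma = 0.2$, $t = 2$ are illustrative assumptions.

```python
import math

def walk_moments(mu, sigma, t, n):
    """Exact mean and variance of R_t for an n-step walk with t = n * dt."""
    dt = t / n
    dx = sigma * math.sqrt(dt)                        # step size
    p = 0.5 * (1 + (mu / sigma) * math.sqrt(dt))      # up-probability
    q = 1 - p                                         # down-probability
    mean = n * (p - q) * dx
    var = 4 * p * q * n * dx ** 2
    return mean, var

mu, sigma, t = 0.08, 0.2, 2.0                         # illustrative values
for n in (10, 1000, 100000):
    mean, var = walk_moments(mu, sigma, t, n)
    # mean equals mu*t for every n; var equals sigma^2 t - mu^2 t dt
    print(n, mean, var, sigma ** 2 * t - mu ** 2 * t * (t / n))
```

The mean matches $\mu t$ for every $n$, while the variance approaches $\sigma^2 t$ only as $\Delta t \to 0$, with the correction term $\mu^2 t\,\Delta t$.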

We now have an idea of how to create a continuous-time stochastic process $R_t$ as the
($\Delta t \to 0$)-limit of a random walk. But the limit process has some peculiar features. For
example

$$\Delta R_t \approx \pm\sigma\sqrt{\Delta t} \quad\text{is of the order of } \sqrt{\Delta t}$$

If $f(t)$ is a differentiable function, then

$$\Delta f(t) \approx f'(t)\,\Delta t \quad\text{is of the order of } \Delta t$$

Now when $\Delta t$ is small, we see that $\sqrt{\Delta t}$ is much larger than $\Delta t$. (Take, e.g., $\Delta t = 10^{-2n}$ and
note that $\sqrt{\Delta t} = 10^{-n} = 10^{n}\,\Delta t$.) It follows that $R_t$ cannot be differentiable as a function of
$t$.
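
To see the blow-up concretely, the following short sketch tabulates the size of the difference quotient $|\Delta R_t| / \Delta t = \sigma / \sqrt{\Delta t}$ for a single step of the walk. The helper name `difference_quotient` and the choice $\sigma = 1$ are assumptions made for illustration.

```python
import math

def difference_quotient(sigma, dt):
    """Size of |Delta R_t| / dt for one random-walk step of size sigma*sqrt(dt)."""
    return sigma * math.sqrt(dt) / dt

# dt = 10^(-2n) for n = 1, ..., 5, so the quotient should be about 10^n.
quotients = [difference_quotient(1.0, 10.0 ** (-2 * n)) for n in range(1, 6)]
for n, quotient in enumerate(quotients, start=1):
    print(f"dt = 1e-{2 * n}: |dR|/dt is about {quotient:.0f}")
```

For a differentiable function the corresponding quotient would settle down to $|f'(t)|$; here it grows without bound as $\Delta t$ shrinks.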

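The probabilist will immediately want to know the distribution of $R_t$. Let $u(t, x)$ be the
density of the random variable $R_t$, i.e.

$$u(t, x)\,\Delta x \approx \mathbb{P}(R_t \in [x, x + \Delta x])$$

At time $t + \Delta t$ the random walk can reach the point $x$ in two ways: It can move right from
the point $x - \Delta x$ at time $t$, with probability $p$, or it can move left from the point $x + \Delta x$,
with probability $q$. Thus

$$u(t + \Delta t, x) = p\,u(t, x - \Delta x) + q\,u(t, x + \Delta x)$$

Now we Taylor expand up to order $\Delta t$. Firstly

$$u(t + \Delta t, x) = u(t, x) + u_t(t, x)\,\Delta t + o(\Delta t)$$

Next,

$$u(t, x \pm \Delta x) = u(t, x) \pm u_x(t, x)\,\Delta x + \tfrac{1}{2}u_{xx}(t, x)\,\Delta x^2 + o(\Delta x^2)$$

Here, we have taken a second-order Taylor expansion, because $\Delta x$ is of the order $\sqrt{\Delta t}$, and
$\Delta x^2$ of the order $\Delta t$. Putting these together, we obtain (at the point $(t, x)$):

$$u + u_t\,\Delta t = (p + q)u + (-p + q)u_x\,\Delta x + \tfrac{1}{2}(p + q)u_{xx}\,\Delta x^2$$

However, we know that $p = \frac{1}{2}\left(1 + \frac{\mu}{\sigma}\sqrt{\Delta t}\right)$ and that $\Delta x \approx \sigma\sqrt{\Delta t}$ and $p, q \approx \frac{1}{2}$ with $p + q = 1$. Hence

$$u_t\,\Delta t = -\left(\tfrac{\mu}{\sigma}\sqrt{\Delta t}\right)u_x\left(\sigma\sqrt{\Delta t}\right) + \tfrac{1}{2}u_{xx}\,\sigma^2\Delta t$$

which yields the following partial differential equation for the density of $R_t$:

$$u_t = -\mu u_x + \tfrac{1}{2}\sigma^2 u_{xx}$$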

However, the PDE is not sufficient to determine the density u: It has many solutions. We 
seek a solution which has the following properties: 






• For each $t > 0$, we have $\int_{-\infty}^{\infty} u(t, x)\,dx = 1$, because $u(t, x)$ is a density, and

• $u(0, x)$ is rather odd: We have $R_0 = 0$, and so

$$\int_{-\infty}^{\infty} f(x)\,u(0, x)\,dx = \mathbb{E}[f(R_0)] = f(0)$$

i.e. $u(0, x)$ is a "function" with the property that $\int_{-\infty}^{\infty} f(x)\,u(0, x)\,dx = f(0)$ for every (continuous) function
$f$. The "function" with this property is called the Dirac delta $\delta_0$. It is not a function
at all, but the simplest example of a so-called generalized function or distribution (in
the sense of Schwartz). Nevertheless, we can get some intuition as to how $u$ ought to
behave. We see that for $t$ close to 0, the density $u(t, x)$ must be very small for $x \neq 0$,
because $R_t$ must be close to 0 when $t$ is near zero. Yet the area under the curve is 1,
i.e. $u(t, x)$ must be extremely peaked around $x = 0$ and then rapidly drop off. We
may thus think of $u(0, x) = \delta_0(x)$ as a "function" which has

$$\int_{-\infty}^{\infty} \delta_0(x)\,dx = 1$$
Oddly enough, we can find such a function. The PDE for the density, derived by Einstein
in 1905, is a version of the heat equation, derived by Fourier, which governs heat transfer.
So this PDE was not new: It had been intensively studied by physicists, with $u(t, x)$ playing
the role of the temperature at time $t$ at a point $x$ in an infinitely long rod. The fundamental
solution or Green's function of such a PDE was well-known:

$$U(t, x) = \frac{1}{\sqrt{2\pi\sigma^2 t}}\,e^{-\frac{(x - \mu t)^2}{2\sigma^2 t}}$$

You can verify by direct differentiation that this function does, in fact, satisfy the PDE. You
will also immediately recognize it as the density of an $N(\mu t, \sigma^2 t)$-random variable. Furthermore,
for $t$ near 0, such a random variable has very small standard deviation, and thus the
density is extremely peaked around 0, just as we require.
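
Direct differentiation is the rigorous route. As a sanity check only, the sketch below approximates $u_t$, $u_x$ and $u_{xx}$ by central finite differences and evaluates the residual $u_t - \left(-\mu u_x + \frac{1}{2}\sigma^2 u_{xx}\right)$ at a few points. The function names, the step `h = 1e-4`, and the parameter values are illustrative assumptions.

```python
import math

def density(t, x, mu, sigma):
    """The N(mu*t, sigma^2*t) density U(t, x)."""
    var = sigma ** 2 * t
    return math.exp(-(x - mu * t) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def pde_residual(t, x, mu, sigma, h=1e-4):
    """Finite-difference estimate of u_t - (-mu*u_x + 0.5*sigma^2*u_xx)."""
    def u(tt, xx):
        return density(tt, xx, mu, sigma)
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    u_xx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / h ** 2
    return u_t - (-mu * u_x + 0.5 * sigma ** 2 * u_xx)

for x in (-0.5, 0.0, 0.4):
    # Residuals should be tiny compared with the density values themselves.
    print(x, density(1.0, x, 0.1, 0.3), pde_residual(1.0, x, 0.1, 0.3))
```

A residual close to zero at every sampled point is consistent with $U$ solving the PDE, although it does not replace the calculus.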

It follows, therefore, that $R_t$ is $N(\mu t, \sigma^2 t)$-distributed. Of course, the Central Limit
Theorem states that, subject to a moment condition, large sums of i.i.d. random variables are roughly normally
distributed, so we are not surprised. But here, we have in essence given a proof of the Central
Limit Theorem by PDE methods, at least for random walks of the type described.
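
The convergence can also be observed by iterating the recursion $u(t + \Delta t, x) = p\,u(t, x - \Delta x) + q\,u(t, x + \Delta x)$ exactly, starting from a unit mass at 0. The sketch below does this for the random walk and compares the result with the $N(\mu t, \sigma^2 t)$ density. The function names and the values $\mu = 0.05$, $\sigma = 0.2$, $t = 1$, $n = 400$ are assumptions chosen for illustration. After $n$ steps the walk lives on a lattice of spacing $2\Delta x$, so a probability mass is converted to a density by dividing by $2\Delta x$.

```python
import math

def walk_distribution(mu, sigma, t, n):
    """Return (dx, probs) with probs[k] = P(R_t = (2k - n) * dx) after n steps."""
    dt = t / n
    dx = sigma * math.sqrt(dt)
    p = 0.5 * (1 + (mu / sigma) * math.sqrt(dt))
    q = 1 - p
    probs = [1.0]                       # all mass at 0 before the first step
    for _ in range(n):
        new = [0.0] * (len(probs) + 1)
        for k, mass in enumerate(probs):
            new[k + 1] += p * mass      # one more up-move
            new[k] += q * mass          # one more down-move
        probs = new
    return dx, probs

def normal_density(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

mu, sigma, t, n = 0.05, 0.2, 1.0, 400
dx, probs = walk_distribution(mu, sigma, t, n)
max_error = max(
    abs(mass / (2 * dx) - normal_density((2 * k - n) * dx, mu * t, sigma ** 2 * t))
    for k, mass in enumerate(probs)
)
print(max_error)    # small compared with the peak density of about 2
```

Increasing `n` should shrink `max_error` further, in line with the limiting argument above.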

When we take $\mu = 0$ and $\sigma = 1$, we obtain one of the basic building blocks of financial
modelling:

Definition 1.1.1 Standard Brownian motion is a continuous-time stochastic process $B_t$, $t \geq 0$,
with the following properties:

(1) Each change

$$B_t - B_s = (B_{s+h} - B_s) + (B_{s+2h} - B_{s+h}) + \dots + (B_t - B_{t-h})$$

is normally distributed with mean 0 and variance $t - s$.

(2) Each change $B_t - B_s$ is independent of all the previous values $B_u$, $u \leq s$.

(3) Each sample path $B_t$, $t \geq 0$ is (a.s.) continuous, and has $B_0 = 0$.






Now put

$$R_t = \mu t + \sigma B_t \quad\text{so that}\quad S_t = S_0 e^{\mu t + \sigma B_t}$$

It then follows easily that

$$R_t \sim N(\mu t, \sigma^2 t)$$

i.e. the standard Brownian motion can also be used to model returns processes where
$\mu \neq 0$ and $\sigma \neq 1$. The process $R_t$ is called an arithmetic Brownian motion with drift rate $\mu$
and variance rate $\sigma^2$. We will also refer to $\sigma$ as the volatility. The process $S_t$ is a geometric
Brownian motion, and is lognormally distributed (i.e. each $\ln S_t$ is normally distributed).
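
A minimal simulation sketch of these two processes follows. It builds $B_t$ from independent $N(0, \Delta t)$ increments, forms $R_t = \mu t + \sigma B_t$ and $S_t = S_0 e^{R_t}$, and compares the sample mean and variance of $R_t$ with $\mu t$ and $\sigma^2 t$. The function name, the seed, and all parameter values are assumptions made for the illustration.

```python
import math
import random

def simulate_return(mu, sigma, t, n_steps, rng):
    """One sample of R_t = mu*t + sigma*B_t, with B_t built from n_steps increments."""
    dt = t / n_steps
    b = 0.0
    for _ in range(n_steps):
        b += rng.gauss(0.0, math.sqrt(dt))   # independent N(0, dt) increment
    return mu * t + sigma * b

rng = random.Random(0)                       # fixed seed for reproducibility
mu, sigma, t, s0 = 0.1, 0.2, 1.0, 100.0      # illustrative parameters
returns = [simulate_return(mu, sigma, t, 20, rng) for _ in range(5000)]
prices = [s0 * math.exp(r) for r in returns] # geometric Brownian motion at time t

sample_mean = sum(returns) / len(returns)
sample_var = sum((r - sample_mean) ** 2 for r in returns) / (len(returns) - 1)
print(sample_mean, mu * t)                   # should be close
print(sample_var, sigma ** 2 * t)            # should be close
print(min(prices) > 0)                       # prices stay positive
```

Unlike the naive additive random walk for $S_t$, the simulated prices can never become negative, because they are exponentials of the returns.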

1.2 Definition and Existence of Brownian Motion 

A Brief History: 

• Brown 1828 — pollen suspended in water jiggles about as if alive. 

• Bachelier 1900 — Theorie de la Speculation. 

• Einstein 1905 — Evidence for existence of atoms. 

• Wiener 1923 — Existence of Brownian motion as mathematical object. 

Definition 1.2.1 A 1-dimensional stochastic process $(B_t : t \geq 0)$ is called a standard Brownian
motion or a Wiener process if and only if

(i) Continuous sample paths: The map $t \mapsto B_t(\omega)$ is continuous for almost all $\omega$.

(ii) Independent increments: Given $0 \leq t_0 < t_1 < \dots < t_n$, the random variables

$$B_{t_k} - B_{t_{k-1}} \qquad k = 1, \dots, n$$

are mutually independent.

(iii) Normally distributed increments: If $0 \leq s < t$, then

$$B_t - B_s \sim N(0, t - s)$$

(iv) (Not essential) $B_0 = 0$ a.s.

If we have $B_0 = x$, then we refer to $(B_t)_t$ as a standard Brownian motion starting at $x$.

□

Condition (iv) is convenient to develop the theory. In practice, we may want a Brownian 
motion to start at some other point $x$ (or even at a random variable $B_0$). Once you've
understood some of the properties of Brownian motion, you will realize that this doesn't 
harm the theory in any way. 

If you peruse some of the results that follow, you will perceive that Brownian motion has 
very, very strange properties indeed. You may end up doubting whether such a creature can 
possibly exist. Thus, without further ado: 






Theorem 1.2.2 (Wiener)

Brownian motion exists.
Let the sample space $\Omega$ be $C[0, \infty)$, the set of all continuous functions from $[0, \infty)$ to $\mathbb{R}$. Let
$\mathcal{F}$ be the $\sigma$-algebra on $\Omega$ which is generated by the projection mappings $\pi_t : C[0, \infty) \to \mathbb{R}$,
which are given by

$$\pi_t : f \mapsto f(t)$$

Then for every $x \in \mathbb{R}$ there exists a probability measure $\mathbb{P}^x$ on $(\Omega, \mathcal{F})$ such that $(\pi_t)_t$ is a
Brownian motion starting at $x$.

⊣



Why did we fix the sample space to be $\Omega = C[0, \infty)$ in the theorem above? Brownian motion has a
natural home, i.e. there is a canonical probability space on which Brownian motion is defined. As above, let
$\mathcal{F} = \sigma(\pi_t : t \geq 0)$ be the smallest $\sigma$-algebra on $C[0, \infty)$ which makes every projection measurable. Then
$(\pi_t)_{t \geq 0}$ is a family of $\mathbb{R}^+$-indexed random variables, and can therefore be thought of as a stochastic process
on the measurable space $(C[0, \infty), \mathcal{F})$.

We now show the following remarkable fact: Given that a Brownian motion $(B_t)_t$ (starting at 0) exists on
some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, we can define, for each $x \in \mathbb{R}$, a measure $\mathbb{P}^x$ on $(C[0, \infty), \mathcal{F})$ such that $(\pi_t)_t$
is a Brownian motion on $(C[0, \infty), \mathcal{F}, \mathbb{P}^x)$, but starting at $x$.

Here's how we do this: First note that we can regard $B$ as a function from $\Omega$ to $C[0, \infty)$. For each $\omega \in \Omega$,
$B(\omega)$ can be thought of as a continuous function whose value at $t$ is $B_t(\omega)$. Thus

$$B : \Omega \to C[0, \infty) \qquad B(\omega) : t \mapsto B_t(\omega)$$

(If necessary, redefine $B_t(\omega)$ to be zero on the nullset of those $\omega$ for which the sample path $(B_t(\omega))_t$ is not
continuous.) Now if $A \in \mathcal{F}$, then $A \subseteq C[0, \infty)$, i.e. $A$ is a set of continuous functions. Define the measure $\mathbb{P}^0$
on $\mathcal{F}$ by

$$\mathbb{P}^0(A) = \mathbb{P}(\{\omega \in \Omega : B(\omega) \in A\})$$

i.e. $\mathbb{P}^0 = \mathbb{P} \circ B^{-1}$, so that $\mathbb{P}^0$ is essentially the distribution of $B$ under $\mathbb{P}$. In particular, if $C \in \mathcal{B}(\mathbb{R})$, then

$$\mathbb{P}^0(\pi_t \in C) = \mathbb{P}(B \in \pi_t^{-1}(C)) = \mathbb{P}(\pi_t(B) \in C) = \mathbb{P}(B_t \in C)$$

It follows that $\pi_t$ has the same distribution under $\mathbb{P}^0$ as does $B_t$ under $\mathbb{P}$. We now show that $(\pi_t)_t$ is a
Brownian motion on $(C[0, \infty), \mathcal{F}, \mathbb{P}^0)$, i.e. we check (i)-(iv) of Definition 1.2.1. Since the sample space consists
of continuous functions, we will write $f$ for a sample point, rather than $\omega$.

(i) For a sample point $f$, the map $t \mapsto \pi_t(f)$ is just the map $f$ (because $\pi_t(f) = f(t)$), and thus certainly
continuous. Thus every sample path is continuous, and not merely almost every sample path.

(ii) For $0 \leq t_0 < t_1 < \dots < t_n$, the vector $(\pi_{t_0}, \dots, \pi_{t_n})$ has the same joint distribution under $\mathbb{P}^0$ as does
$(B_{t_0}, \dots, B_{t_n})$ under $\mathbb{P}$ (apply the definition of $\mathbb{P}^0$ to sets of the form $\{f : (f(t_0), \dots, f(t_n)) \in C\}$). Hence the
increments of $\pi$ are independent under $\mathbb{P}^0$, because those of $B$ are independent under $\mathbb{P}$.

(iii) Just like (ii): Since $\pi_t - \pi_s$ has the same distribution under $\mathbb{P}^0$ as does $B_t - B_s$ under $\mathbb{P}$, it follows that $\pi_t - \pi_s$ is
$N(0, t - s)$ under $\mathbb{P}^0$.

(iv) $\mathbb{P}^0(\pi_0 = 0) = \mathbb{P}(B_0 = 0) = 1$, i.e. $\pi_0 = 0$ almost surely.

The measure $\mathbb{P}^0$ is called Wiener measure. It can be thought of as a measure on the set of all sample
paths, and this provides a very intuitive way of looking at things. For example, since $\pi_0 = 0$ a.s., the set of
continuous functions $f$ with $f(0) \neq 0$ is a $\mathbb{P}^0$-nullset. Similarly, as we shall see later, the set of differentiable
paths is a nullset, because a Brownian sample path is nowhere differentiable almost surely.

Now let $x \in \mathbb{R}$, and define $\mathbb{P}^x$ on $(C[0, \infty), \sigma(\pi_t : t \geq 0))$ by

$$\mathbb{P}^x(F) = \mathbb{P}^0(F - x)$$

Here $F$ is a set of continuous functions, and $F - x = \{f - x : f \in F\}$ is another set of continuous functions.
It is easy to see that $F - x$ is measurable if $F$ is, so that this definition makes sense. Note now that

$$\mathbb{P}^x(\pi_0 = x) = \mathbb{P}^x(\{f : f(0) = x\}) = \mathbb{P}^0(\{f - x : f(0) = x\}) = \mathbb{P}^0(\{g : g(0) = 0\}) = 1$$

It is now easy to see that $(\pi_t)_t$ is a Brownian motion starting at $x$ under $\mathbb{P}^x$.

Thus the same process is a Brownian motion under each $\mathbb{P}^x$, but it starts at a different place under each
measure!






Remarks 1.2.3 We have two candidates for a "natural" $\sigma$-algebra on $C[0, \infty)$. The first is $\mathcal{F} = \sigma(\pi_t : t \geq 0)$,
the $\sigma$-algebra generated by the projections, used above. The second is obtained via topological considerations.
We can define a metric $\rho$ on $C[0, \infty)$ as follows: First define $\rho_n$ on $C[0, \infty)$ by

$$\rho_n(f, g) = \sup_{0 \leq t \leq n} |f(t) - g(t)|$$

Then define $\rho$ by combining the $\rho_n$, for instance

$$\rho(f, g) = \sum_{n=1}^{\infty} 2^{-n} \min\{\rho_n(f, g), 1\}$$

Then $\rho$ is a metric on $C[0, \infty)$, and thus induces a topology. This topology is called the topology of uniform
convergence on compact sets, because $f_n \to f$ in this topology if and only if $f_n \to f$ uniformly on every
compact set.

Now any topological space $X$ has a natural $\sigma$-algebra, namely its Borel algebra. This is the smallest
$\sigma$-algebra on $X$ which contains every open set. In our case, the Borel algebra $\mathcal{B}$ of the topological space
$C[0, \infty)$ coincides with the algebra $\mathcal{F}$, i.e. our two "natural" $\sigma$-algebras coincide.

To see that this is so is an exercise. First show that each $\pi_t : C[0, \infty) \to \mathbb{R}$ is continuous (i.e. that if
$\rho(f, g)$ is small, then $|f(t) - g(t)|$ is small as well). This shows that $\mathcal{F} \subseteq \mathcal{B}$. Next suppose that $f_0 \in C[0, \infty)$.
Note that $\rho_n(f_0, f) = \sup_{q \in \mathbb{Q} \cap [0, n]} |f_0(q) - f(q)|$ and conclude that $f \mapsto \rho_n(f_0, f)$, hence $f \mapsto \rho(f_0, f)$, is $\mathcal{F}$-measurable. Further note that
$C[0, \infty)$ is separable, so that every subset of it is separable as well. Finally, let $F$ be a closed subset of
$C[0, \infty)$ with countable dense subset $\{f_n : n \in \mathbb{N}\}$, and note that $F = \{f \in C[0, \infty) : \inf_n \rho(f, f_n) = 0\}$. This proves that $\mathcal{F}$ contains every closed subset
of $C[0, \infty)$, and hence that $\mathcal{B} \subseteq \mathcal{F}$.

□



We now say a few words about the measures $\mathbb{P}^x$. Let

$$p_t(a, b) = \frac{1}{\sqrt{2\pi t}}\,e^{-\frac{(b - a)^2}{2t}}$$

be the density function of a $N(0, t)$-random variable, evaluated at $b - a$. If $A_1 \in \mathcal{B}(\mathbb{R})$, then clearly

$$\mathbb{P}^x(B_{t_1} \in A_1) = \int_{A_1} p_{t_1}(x, y)\,dy$$

We see that $p_{t_1}(x, y)\,dy$ can be thought of as the probability that $B_t$ moves from point $x$ at
time $t = 0$ to point $y$ at time $t = t_1$. The probability of ending up in the set $A_1$ is then given
by summing over all $y \in A_1$, yielding the above integral.

Next, since $B_{t_1}$ and $B_{t_2} - B_{t_1}$ are independent when $t_2 > t_1$, we have

$$\mathbb{P}^x(B_{t_1} \in A_1, B_{t_2} \in A_2) = \int_{A_1}\int_{A_2} p_{t_1}(x, x_1)\,p_{t_2 - t_1}(x_1, x_2)\,dx_2\,dx_1$$

We can interpret $p_{t_1}(x, x_1)\,p_{t_2 - t_1}(x_1, x_2)\,dx_1\,dx_2$ as the probability that $B_t$ moves from point
$x$ at time $t = 0$ to point $x_2$ at time $t = t_2$, via point $x_1$ at time $t_1$. For this reason, the
function $p_t(x, y)$ is known as the transition density: It governs how Brownian motion moves
from point to point.

This generalizes: Let $0 = t_0 < t_1 < \dots < t_n$, and also let $x_0 = x$ (the starting point).
Then the joint density function of $(B_{t_1}, \dots, B_{t_n})$ is given by

$$\mathbb{P}^x(B_{t_0} = x_0, \dots, B_{t_n} = x_n) = \prod_{k=1}^{n} p_{t_k - t_{k-1}}(x_{k-1}, x_k)\,dx_1 \dots dx_n$$

so that, for example,

$$\mathbb{P}^x(B_{t_0} \in A_0, \dots, B_{t_n} \in A_n) = I_{A_0}(x)\int_{A_1} \dots \int_{A_n} \prod_{k=1}^{n} p_{t_k - t_{k-1}}(x_{k-1}, x_k)\,dx_n \dots dx_1 \qquad (*)$$
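
One consequence of the two-time formula is worth checking numerically: taking $A_1 = \mathbb{R}$, i.e. integrating out the intermediate point $x_1$, must give back the one-step transition density $p_{t_2}(x, x_2)$. The sketch below does this with a simple midpoint rule. The function names, the integration range $[-12, 12]$, and the sample values of $t_1, t_2, x, x_2$ are illustrative assumptions.

```python
import math

def p(t, x, y):
    """Transition density p_t(x, y) of standard Brownian motion."""
    return math.exp(-(y - x) ** 2 / (2 * t)) / math.sqrt(2 * math.pi * t)

def via_intermediate_point(t1, t2, x, x2, lo=-12.0, hi=12.0, n=4000):
    """Midpoint rule for the integral of p_{t1}(x, x1) * p_{t2-t1}(x1, x2) over x1."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x1 = lo + (i + 0.5) * h
        total += p(t1, x, x1) * p(t2 - t1, x1, x2)
    return total * h

t1, t2, x, x2 = 0.5, 1.5, 0.3, -0.4          # illustrative values
two_step = via_intermediate_point(t1, t2, x, x2)
one_step = p(t2, x, x2)
print(two_step, one_step)                    # the two numbers should agree closely
```

Agreement of the two values reflects the fact that moving from $x$ to $x_2$ in time $t_2$ is the same as moving via any intermediate point at time $t_1$, summed over all such points.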



Note that the $\sigma$-algebra $\mathcal{F} = \sigma(\pi_t : t \geq 0)$ on Wiener space is generated by the $\pi$-system of sets of the
form $\{B_{t_0} \in A_0, \dots, B_{t_n} \in A_n\}$. Thus (*) shows how $\mathbb{P}^x$ is defined on a $\pi$-system generating $\mathcal{F}$. Kolmogorov's
Extension Theorem and Kolmogorov's Continuity Theorem allow us to extend (*) to prove the existence of a
Brownian motion on $C[0, \infty)$.

We can also generalize the definition of the measures $\mathbb{P}^x$: Let $\mu$ be a probability distribution, i.e. a
probability measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$. Define a measure $\mathbb{P}^\mu$ on $(C[0, \infty), \sigma(\pi_t : t \geq 0))$ by

$$\mathbb{P}^\mu(F) = \int \mathbb{P}^x(F)\,d\mu(x)$$

Then $(\pi_t)_t$ is a Brownian motion with the property that $\pi_0$ is $\mu$-distributed. To see this, first note that, for
each $F$, the map

$$x \mapsto \mathbb{P}^x(F)$$

is Borel measurable (i.e. measurable from $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ to $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$): Certainly, if $F$ is of the form $F = \{\pi_{t_0} \in A_0, \dots, \pi_{t_n} \in A_n\}$, where $0 = t_0 < t_1 < \dots < t_n$, we see that

$$\mathbb{P}^x(F) = I_{A_0}(x)\int_{A_1} \dots \int_{A_n} p_{t_1}(x, y_1) \dots p_{t_n - t_{n-1}}(y_{n-1}, y_n)\,dy_1 \dots dy_n$$

so that $x \mapsto \mathbb{P}^x(F)$ is measurable. This proves that $x \mapsto \mathbb{P}^x(F)$ is measurable for all $F$ in a $\pi$-system which
generates $\sigma(\pi_t : t \geq 0)$. It now follows easily from Dynkin's Lemma that the map is measurable for all $F$.
Next, note that

$$\mathbb{P}^\mu(\pi_0 \in A) = \int \mathbb{P}^x(\pi_0 \in A)\,d\mu(x) = \int I_A(x)\,d\mu(x) = \mu(A)$$

This proves that $\pi_0$ is $\mu$-distributed under $\mathbb{P}^\mu$. Now it is clear that $(\pi_t)_t$-sample paths are continuous.
Furthermore, $\mathbb{P}^\mu(\pi_t - \pi_s \in A) = \int \mathbb{P}^x(\pi_t - \pi_s \in A)\,d\mu(x) = \int \mu_Z(A)\,d\mu(x) = \mu_Z(A)$, where $Z$ is $N(0, t - s)$, and $\mu_Z$
is its distribution. It follows that $\pi_t - \pi_s$ has the distribution (under $\mathbb{P}^\mu$) of a $N(0, t - s)$-variable. Similarly, it
is easily shown that the process $(\pi_t - \pi_0)_t$ is Gaussian under $\mathbb{P}^\mu$. We now invoke a future result, Proposition 1.5.2.
According to this proposition, it only remains for us to show that $\operatorname{Cov}(\pi_t - \pi_0, \pi_s - \pi_0) = s$ (when $s \leq t$). But this is
obvious, as it is so under each $\mathbb{P}^x$.

Very often, we will work with a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, equipped with a filtration $(\mathcal{F}_t)_t$,
such that each $B_t$ is $\mathcal{F}_t$-measurable, and such that instead of (ii) the stronger condition

(ii)' $B_t - B_s$ is independent of $\mathcal{F}_s$ for all $0 \leq s < t$

holds. Such a Brownian motion is called an $(\mathcal{F}_t)$-Brownian motion.



Definition 1.2.4 A stochastic process $X_t$ is said to be a stationary process, or to have
stationary increments, if and only if for any $0 \leq s < t$ and any $h > 0$, the random variables
$X_t - X_s$ and $X_{t+h} - X_{s+h}$ are identically distributed.

□ 

It is easy to see directly from the definition that Brownian motion is a stationary process. 
Moreover, the increments over disjoint time intervals are independent. 

Proposition 1.2.5 If $B_t$ is a standard Brownian motion, then

a) The processes $B_t$, $B_t^2 - t$, and $\exp(\theta B_t - \frac{1}{2}\theta^2 t)$ are martingales (with respect to the natural
filtration). Here $\theta \in \mathbb{R}$ is a constant.

b) $\operatorname{Cov}(B_s, B_t) = s \wedge t$

Proof: Exercise.

⊣
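
The proof is left as an exercise, but some consequences of the proposition are easy to test by simulation. A martingale has constant expectation, so $\mathbb{E}[B_t^2 - t] = 0$ and $\mathbb{E}[\exp(\theta B_t - \frac{1}{2}\theta^2 t)] = 1$ for every $t$. The Monte Carlo sketch below estimates these two expectations together with $\operatorname{Cov}(B_s, B_t)$. It checks only these necessary consequences, not the martingale property itself, and the seed, sample size and values of $s$, $t$, $\theta$ are assumptions.

```python
import math
import random

rng = random.Random(1)                       # fixed seed for reproducibility
s, t, theta, n = 0.5, 2.0, 0.7, 40000        # illustrative values, s < t

bs_samples, bt_samples = [], []
for _ in range(n):
    bs = rng.gauss(0.0, math.sqrt(s))                # B_s ~ N(0, s)
    bt = bs + rng.gauss(0.0, math.sqrt(t - s))       # add an independent increment
    bs_samples.append(bs)
    bt_samples.append(bt)

def mean(values):
    return sum(values) / len(values)

cov_st = mean([a * b for a, b in zip(bs_samples, bt_samples)]) \
    - mean(bs_samples) * mean(bt_samples)
mean_square_minus_t = mean([b * b - t for b in bt_samples])
mean_exponential = mean([math.exp(theta * b - 0.5 * theta ** 2 * t) for b in bt_samples])

print(cov_st, min(s, t))          # estimate of Cov(B_s, B_t) versus s ^ t
print(mean_square_minus_t)        # should be near 0
print(mean_exponential)           # should be near 1
```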

1.3 Characteristic Functions and Independence 

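Suppose that $(X_1, \dots, X_n)$ is a random vector on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Recall that
the characteristic function $\varphi_{X_1, \dots, X_n} : \mathbb{R}^n \to \mathbb{C}$ is defined by

$$\varphi_{X_1, \dots, X_n}(t_1, \dots, t_n) = \mathbb{E}\left[e^{i(t_1 X_1 + \dots + t_n X_n)}\right]$$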
The most important fact about characteristic functions is the following: 

Theorem 1.3.1 (Levy) Two random vectors have the same distribution if and only if they 
have the same characteristic function. 

□ 

The proof of the above theorem may be found in almost any advanced text on probability 
theory. 

As a simple corollary, we have the following useful result: 
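Corollary 1.3.2 Two random variables $X, Y$ are independent if and only if $\varphi_{X,Y}(s, t) = \varphi_X(s)\varphi_Y(t)$.

Proof: If $X, Y$ are independent, then

$$\varphi_{X,Y}(s, t) = \mathbb{E}[e^{i(sX + tY)}] = \mathbb{E}[e^{isX}e^{itY}] = \mathbb{E}[e^{isX}] \cdot \mathbb{E}[e^{itY}] = \varphi_X(s)\varphi_Y(t)$$

Conversely, suppose that $\varphi_{X,Y}(s, t) = \varphi_X(s)\varphi_Y(t)$. Let $\tilde{X}, \tilde{Y}$ be independent random
variables such that $\tilde{X}$ has the same distribution as $X$, and $\tilde{Y}$ as $Y$. Then $\varphi_X = \varphi_{\tilde{X}}$ and
$\varphi_Y = \varphi_{\tilde{Y}}$. Thus

$$\varphi_{X,Y}(s, t) = \varphi_X(s)\varphi_Y(t) = \varphi_{\tilde{X}}(s)\varphi_{\tilde{Y}}(t) = \varphi_{\tilde{X},\tilde{Y}}(s, t)$$

as $\tilde{X}, \tilde{Y}$ are independent. It follows that $(X, Y)$ and $(\tilde{X}, \tilde{Y})$ have the same characteristic
function, and thus the same distribution. In particular, $X$ is independent of $Y$, because $\tilde{X}$ is
independent of $\tilde{Y}$.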

□ 
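
Corollary 1.3.2 can be checked exactly for small discrete distributions. In the sketch below (the function names and the two example distributions are assumptions for illustration), the joint characteristic function of two independent fair $\pm 1$ coins factorizes, whereas for the fully dependent pair $Y = X$ it does not.

```python
import cmath

def joint_cf(pmf, s, t):
    """Characteristic function E[exp(i(sX + tY))] of a discrete pair (X, Y)."""
    return sum(prob * cmath.exp(1j * (s * x + t * y)) for (x, y), prob in pmf.items())

def marginal_cf(pmf, s, axis):
    """Characteristic function of X (axis=0) or Y (axis=1)."""
    return sum(prob * cmath.exp(1j * s * xy[axis]) for xy, prob in pmf.items())

def factorises(pmf, grid, tol=1e-12):
    """True if the joint c.f. equals the product of marginals at all grid points."""
    return all(
        abs(joint_cf(pmf, s, t) - marginal_cf(pmf, s, 0) * marginal_cf(pmf, t, 1)) < tol
        for s in grid
        for t in grid
    )

independent = {(x, y): 0.25 for x in (-1, 1) for y in (-1, 1)}   # two fair coins
dependent = {(-1, -1): 0.5, (1, 1): 0.5}                         # Y = X

grid = [0.3, 1.0, 2.5]
print(factorises(independent, grid))   # True
print(factorises(dependent, grid))     # False
```

In the dependent case the joint characteristic function is $\cos(s + t)$ while the product of the marginals is $\cos s \cos t$, so the two differ by $\sin s \sin t$.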

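Theorem 1.3.3 (Kac) A random vector $X$ is independent of a $\sigma$-algebra $\mathcal{G}$ if and only if

$$\mathbb{E}[e^{it \cdot X} \mid \mathcal{G}] = \mathbb{E}[e^{it \cdot X}] \qquad\text{for all } t$$

Proof: ($\Rightarrow$) is a basic property of conditional expectation.

($\Leftarrow$) We will prove it for the one-dimensional case. Let $Y$ be any $\mathcal{G}$-measurable random
variable. Then, using the properties of conditional expectation,

$$\varphi_{X,Y}(s, t) = \mathbb{E}[e^{i(sX + tY)}] = \mathbb{E}\big[\mathbb{E}[e^{i(sX + tY)} \mid \mathcal{G}]\big] = \mathbb{E}\big[e^{itY}\,\mathbb{E}[e^{isX} \mid \mathcal{G}]\big] = \mathbb{E}[e^{itY}]\,\mathbb{E}[e^{isX}] = \varphi_X(s)\varphi_Y(t)$$

Hence $X$ is independent of every $\mathcal{G}$-measurable random variable, and thus independent of $\mathcal{G}$.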

□ 






1.4 The Multi-Normal Distribution

We begin by gathering some results about the (multivariate) normal distribution. Recall: 

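Definition 1.4.1 Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable $X : \Omega \to \mathbb{R}$ is
normal (or Gaussian) if $X$ has a density function of the form

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

i.e. if for all $A \in \mathcal{B}(\mathbb{R})$ we have

$$\mathbb{P}(X \in A) = \int_A f_X(t)\,dt$$

□

It is well-known that

$$\mathbb{E}X = \mu \qquad \operatorname{Var}(X) = \sigma^2$$

and we say that $X$ is $N(\mu, \sigma^2)$. It is also well-known that the characteristic function of $X$ is

$$\varphi_X(t) = e^{i\mu t - \frac{1}{2}\sigma^2 t^2}$$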

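Definition 1.4.2 A random vector $X = (X_1, X_2, \dots, X_d)^T : \Omega \to \mathbb{R}^d$ is called multivariate
Gaussian or (multi)normal if $X$ has a density of the form

$$f_X(x) = \frac{1}{(2\pi)^{d/2}\sqrt{\det(\Sigma)}} \exp\left(-\frac{(x - \mu)^T \Sigma^{-1} (x - \mu)}{2}\right)$$

where $\mu$ is a $d$-dimensional vector, and $\Sigma$ a symmetric positive definite $d \times d$-matrix (see the footnote below). It is
easy to check that in that case

$$\mu = \mathbb{E}X = \begin{pmatrix} \mathbb{E}X_1 \\ \vdots \\ \mathbb{E}X_d \end{pmatrix}$$

and

$$\Sigma = \mathbb{E}[(X - \mu)(X - \mu)^T] = \begin{pmatrix} \operatorname{Cov}(X_1, X_1) & \operatorname{Cov}(X_1, X_2) & \dots & \operatorname{Cov}(X_1, X_d) \\ \vdots & & & \vdots \\ \operatorname{Cov}(X_d, X_1) & \operatorname{Cov}(X_d, X_2) & \dots & \operatorname{Cov}(X_d, X_d) \end{pmatrix}$$

We say that $X$ has mean $\mu$ and covariance matrix $\Sigma$.

□

Footnote: A square matrix $A$ is non-negative definite if $x^T A x \geq 0$ for any $x$. It is positive definite if it is non-negative
definite and $x^T A x = 0$ only if $x = 0$. Note that if $X$ is a random vector with covariance matrix $\Sigma$, then
$a^T \Sigma a = \operatorname{Var}(a \cdot X) \geq 0$, so that $\Sigma$ is non-negative definite. That $\Sigma$ is symmetric is obvious.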






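Next, we examine the characteristic function of a multinormal random vector. If $X = (X_1, \dots, X_d)$
is a random vector, its characteristic function is defined by

$$\varphi_X(t) = \mathbb{E}[e^{it \cdot X}] = \int_{\mathbb{R}^d} e^{it \cdot x}\,\mathbb{P}_X(dx) \qquad t \in \mathbb{R}^d$$

where $\mathbb{P}_X$ is the distribution of $X$, i.e. a probability measure on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ satisfying
$\mathbb{P}_X(A) = \mathbb{P}(X \in A)$.

Proposition 1.4.3 If $X$ is multinormal with mean vector $\mu$ and covariance matrix $\Sigma$, then
its characteristic function is given by:

$$\varphi_X(t) = e^{it \cdot \mu - \frac{1}{2}t^T \Sigma t}$$

Proof: An exercise in integration.

⊣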

Remarks 1.4.4 Some sources define a random vector $X$ to be normal if and only if there
is a non-negative definite matrix $\Sigma$ such that the characteristic function of $X$ is given as
above. This extends the concept of a multinormal vector slightly, as it does not require $\Sigma$ to
be invertible.

For example, if $X_1$ is $N(0, 1)$, and $X_2 = 0$ is constant, then $\varphi_X(t_1, t_2) = e^{-\frac{1}{2}t_1^2}$ and so
$\Sigma = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$. Then $\Sigma$ is not invertible, and thus $X$ is not multinormal in the sense of Definition
1.4.2, but it is multinormal in the extended sense.

□ 

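Proposition 1.4.5 Let $X_i : \Omega \to \mathbb{R}$ for $i = 1, \dots, d$. Then $X = (X_1, X_2, \dots, X_d)$ is multinormal
if and only if

$$\lambda_1 X_1 + \lambda_2 X_2 + \dots + \lambda_d X_d$$

is (univariate) normal for all $\lambda_1, \lambda_2, \dots, \lambda_d \in \mathbb{R}$.

Proof: Recall Lévy's Inversion Theorem: If two random vectors have the same characteristic
functions, then they have the same distributions. Let $\mu$ be the mean and $\Sigma$ the covariance
matrix of $X$.

Now suppose that $\lambda_1 X_1 + \lambda_2 X_2 + \dots + \lambda_d X_d$ is normal for all $\lambda = (\lambda_j) \in \mathbb{R}^d$, and
define $Z_\lambda = \lambda \cdot X$. Clearly

$$\mathbb{E}[Z_\lambda] = \lambda \cdot \mu \qquad \operatorname{Var}(Z_\lambda) = \lambda^T \Sigma \lambda$$

Now because $Z_\lambda$ is univariate normal, its characteristic function is given by

$$\varphi_{Z_\lambda}(t) = e^{i(\lambda \cdot \mu)t - \frac{1}{2}(\lambda^T \Sigma \lambda)t^2}$$

In particular, substituting $t = 1$, we get

$$\varphi_{Z_\lambda}(1) = e^{i\lambda \cdot \mu - \frac{1}{2}\lambda^T \Sigma \lambda}$$

Now note that $\varphi_X(\lambda) = \mathbb{E}[e^{i\lambda \cdot X}] = \mathbb{E}[e^{iZ_\lambda}] = \varphi_{Z_\lambda}(1)$. It follows that $X$ has the characteristic
function of a multinormal random variable with mean $\mu$ and covariance matrix $\Sigma$. By Lévy
Inversion, this means that $X$ is a multinormal random variable with mean $\mu$ and covariance
matrix $\Sigma$.

Conversely, suppose that $X$ is multinormal (with mean $\mu$ and covariance matrix $\Sigma$), and let
$\lambda \in \mathbb{R}^d$. Then $\varphi_{Z_\lambda}(t) = \mathbb{E}[e^{it\lambda \cdot X}] = \varphi_X(\lambda t) = e^{i(\lambda \cdot \mu)t - \frac{1}{2}(\lambda^T \Sigma \lambda)t^2}$. Thus $Z_\lambda$ has the characteristic
function of a (univariate) normal random variable, and is therefore normal.

It is now easy to prove the following:

Corollary 1.4.6 If $X = (X_1, X_2, \dots, X_d)^T$ is multinormal, then each $X_i$ is normal.

⊣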

The converse of this is false, as you will see in the next exercise. 

Recall that two random variables $X, Y$ are called uncorrelated provided that

$$\mathbb{E}[XY] = \mathbb{E}X \cdot \mathbb{E}Y$$

In that case their correlation coefficient $\rho_{XY}$ is zero. It is well-known that independent random
variables are uncorrelated, but the converse is not true: Uncorrelated random variables
need not be independent. However

Proposition 1.4.7 Suppose that $X = (X_1, \dots, X_d)^T$ is a multinormal vector. Then $X_1, \dots, X_d$
are mutually independent if and only if they are uncorrelated, i.e. if and only if the covariance
matrix is diagonal.

⊣
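
Multinormality is essential in this proposition. As a small illustration (the example is an assumption chosen for simplicity, not taken from the text), let $X$ be uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The sketch below computes, in exact rational arithmetic, that $\operatorname{Cov}(X, Y) = 0$ although $X$ and $Y$ are clearly dependent.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Joint distribution of (X, Y) with X uniform on {-1, 0, 1} and Y = X^2.
pmf = {(-1, 1): third, (0, 0): third, (1, 1): third}

e_x = sum(prob * x for (x, y), prob in pmf.items())
e_y = sum(prob * y for (x, y), prob in pmf.items())
e_xy = sum(prob * x * y for (x, y), prob in pmf.items())
covariance = e_xy - e_x * e_y

# Independence would require P(X = 0, Y = 0) = P(X = 0) * P(Y = 0).
p_x0 = sum(prob for (x, y), prob in pmf.items() if x == 0)
p_y0 = sum(prob for (x, y), prob in pmf.items() if y == 0)
p_joint = pmf.get((0, 0), Fraction(0))

print(covariance)              # 0
print(p_joint, p_x0 * p_y0)    # 1/3 versus 1/9
```

The pair $(X, Y)$ is of course not multinormal, so there is no conflict with the proposition.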

Before we prove this proposition, it will be handy to note: 

Lemma 1.4.8 Let $X = (X_1, \dots, X_d)$ be a random vector. Then $X_1, \dots, X_d$ are independent
if and only if

$$\varphi_X(t_1, \dots, t_d) = \varphi_{X_1}(t_1) \cdots \varphi_{X_d}(t_d)$$

i.e. iff the characteristic function of the random vector can be factorized as a product of
characteristic functions of the individual components.

Proof: For simplicity of notation, we prove this for the case $d = 2$. Clearly, if $X, Y$ are
independent, then $\varphi_{X,Y}(s, t) = \varphi_X(s)\varphi_Y(t)$.

Now suppose that $\varphi_{X,Y}(s, t) = \varphi_X(s)\varphi_Y(t)$ for all $s, t \in \mathbb{R}$. Then

$$\int e^{i(sx + ty)}\,\mathbb{P}_{X,Y}(dx, dy) = \varphi_{X,Y}(s, t) = \varphi_X(s)\varphi_Y(t) = \int e^{isx}\,\mathbb{P}_X(dx) \int e^{ity}\,\mathbb{P}_Y(dy) = \int e^{i(sx + ty)}\,\mathbb{P}_X \otimes \mathbb{P}_Y(dx, dy)$$

(using Fubini's Theorem). Here $\mathbb{P}_{X,Y}, \mathbb{P}_X, \mathbb{P}_Y$ are the distributions of the associated random
vectors/variables. It follows that $\mathbb{P}_{X,Y} = \mathbb{P}_X \otimes \mathbb{P}_Y$, and thus that $X, Y$ are independent.

⊣



Proof of 1.4.7 If the covariance matrix $\Sigma$ is diagonal, then the characteristic function of $X$
is easily seen to be factorisable in the sense of the previous lemma.



1.5 Gaussian Processes 

Definition 1.5.1 A stochastic process {Xt : t > 0) is said to be Gaussian if and only if for 
any < ti < • • • < the random vector {Xt-^ , • • • , Xt^) is multivariate Gaussian. 

□ 

Proposition 1.5.2 An a.s. continuous stochastic process $X_t$ (with $X_0 = 0$) is a Brownian
motion if and only if it is a Gaussian process with $\mathbb{E}X_t = 0$ (for all $t$) and
$\mathrm{Cov}(X_s, X_t) = s \wedge t$.

Proof: It is an exercise that if $B_t$ is a Brownian motion, then $\mathrm{Cov}(B_s, B_t) = s \wedge t$.
To see that Brownian motion is a Gaussian process, consider $0 \le t_0 < \dots < t_n$. We must show
that the random vector $(B_{t_0}, \dots, B_{t_n})$ is multivariate Gaussian. For this, it suffices to
show that $\lambda_0 B_{t_0} + \dots + \lambda_n B_{t_n}$ is normal for any
$\lambda_0, \dots, \lambda_n \in \mathbb{R}$. But $\lambda_0 B_{t_0} + \dots + \lambda_n B_{t_n}$ can be
rewritten as $a_0 (B_{t_0} - B_0) + a_1(B_{t_1} - B_{t_0}) + \dots + a_n(B_{t_n} - B_{t_{n-1}})$, where
$a_k = \lambda_k + \dots + \lambda_n$, a sum of independent normal random variables. But sums of
independent normal variables are normal. Hence Brownian motion is a Gaussian process.

Conversely, suppose that $X_t$ is an a.s. continuous Gaussian process with $\mathbb{E}X_t = 0$ and
$\mathrm{Cov}(X_s, X_t) = s \wedge t$. Note that if $s \le t$, then
$\mathrm{Var}(X_t - X_s) = t - 2s + s = t - s$, so that $X_t - X_s \sim N(0, t - s)$. It remains to
show that $X_t$ has independent increments. So let $0 \le t_0 < \dots < t_n$, and let
$Y_k = X_{t_k} - X_{t_{k-1}}$. To show that the $Y_k$ are independent, it suffices (by Proposition
1.4.7) to show that the covariance matrix $\Sigma$ of the multinormal vector $Y = (Y_1, \dots, Y_n)^T$
is diagonal. Now if $i < j$, then

$$\Sigma_{ij} = \mathrm{Cov}(X_{t_i}, X_{t_j}) - \mathrm{Cov}(X_{t_{i-1}}, X_{t_j}) - \mathrm{Cov}(X_{t_i}, X_{t_{j-1}}) + \mathrm{Cov}(X_{t_{i-1}}, X_{t_{j-1}})$$
$$= t_i - t_{i-1} - t_i + t_{i-1} = 0$$

Because $\Sigma$ is symmetric, we also have $\Sigma_{ij} = 0$ for $i > j$, i.e. $\Sigma_{ij} = 0$
whenever $i \ne j$. Hence $\Sigma$ is a diagonal matrix.
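
As a numerical sanity check of the covariance structure $s \wedge t$, here is a minimal Monte Carlo sketch. It assumes that Brownian motion is sampled through independent $N(0, t_k - t_{k-1})$ increments; the function names, the sample size and the seed are illustrative choices, not part of the notes.

```python
import random

def simulate_brownian(times, rng):
    """Sample one Brownian path at the given increasing times via independent increments."""
    path, value, prev = [], 0.0, 0.0
    for t in times:
        value += rng.gauss(0.0, (t - prev) ** 0.5)  # increment ~ N(0, t - prev)
        path.append(value)
        prev = t
    return path

def empirical_covariance(times, n_paths, seed=0):
    """Estimate E[B_s B_t] for all pairs of times (the mean is known to be 0)."""
    rng = random.Random(seed)
    d = len(times)
    acc = [[0.0] * d for _ in range(d)]
    for _ in range(n_paths):
        p = simulate_brownian(times, rng)
        for i in range(d):
            for j in range(d):
                acc[i][j] += p[i] * p[j]
    return [[a / n_paths for a in row] for row in acc]

times = [0.5, 1.0, 2.0]
cov_hat = empirical_covariance(times, 50_000)
cov_exact = [[min(s, t) for t in times] for s in times]
for row_hat, row_exact in zip(cov_hat, cov_exact):
    print([round(x, 2) for x in row_hat], row_exact)
```

The estimated matrix should agree with the matrix of minima up to Monte Carlo error, which shrinks like one over the square root of the number of paths.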






1.6 Some Useful and Interesting Properties 

For those brought up on a diet of calculus and smooth functions, Brownian motion has many 
weird and counterintuitive properties, some of which are described below. (Indeed, in the 
19th century, some of these would have been regarded as a "proof" of the non-existence of 
Brownian motion.) 

Proposition 1.6.1 Suppose that $B_t$ is a standard Brownian motion. Then so are

(1) $\tilde{B}_t = cB_{t/c^2}$ for $c \ne 0$ (Scaling)

(2) $\tilde{B}_t = tB_{1/t}$ for $t > 0$, $\tilde{B}_0 = 0$ (Time Inversion)

(3) $\tilde{B}_t = (B_{t+a} - B_a : t \ge 0)$ for any $a \ge 0$ (Time Homogeneity)

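Proof: We prove only (2), and leave (1), (3) as exercises. Let $\tilde{B}_t = tB_{1/t}$ for $t > 0$,
and put $\tilde{B}_0 = 0$. It is clear that $\tilde{B}_t$ is a Gaussian process (because $B_t$ is),
and that $\mathbb{E}\tilde{B}_t = 0$ for all $t$. Moreover, if $0 < s \le t$ then
$\mathrm{Cov}(\tilde{B}_s, \tilde{B}_t) = ts\left(\frac{1}{s} \wedge \frac{1}{t}\right) = \frac{ts}{t} = s = s \wedge t$.
Thus the only thing that still needs to be proved is that $\tilde{B}_t$ is continuous. This is
certainly obvious for $t > 0$, because $B_t$ is continuous. We need only prove, therefore, that
$\tilde{B}_t$ is continuous at $t = 0$, i.e. that

$$\lim_{t \downarrow 0} tB_{1/t} = 0 \quad \text{a.s.}$$

A quick way of seeing this is by the Strong Law of Large Numbers: If $N$ is large, then

$$B_N = (B_1 - B_0) + (B_2 - B_1) + \dots + (B_N - B_{N-1})$$

is the sum of a large number of independent, identically distributed random variables with mean
zero. Thus $B_N/N \to 0$ a.s. as $N \to \infty$. Put $t = \frac{1}{N}$ and use continuity of $B_u$ for
$u > 0$ to conclude $tB_{1/t} \to 0$ as $t \downarrow 0$.

A more roundabout way of seeing this is as follows: $\tilde{B}_t \to 0$ as $t \downarrow 0$ if and
only if for every $n \in \mathbb{N}$ there exists $m \in \mathbb{N}$ such that
$\sup_{0 < s < 1/m} |\tilde{B}_s| \le \frac{1}{n}$, and, by continuity, this is the case if and only
if for every $n \in \mathbb{N}$ there exists $m \in \mathbb{N}$ such that
$\sup_{0 < q < 1/m,\, q \in \mathbb{Q}} |\tilde{B}_q| \le \frac{1}{n}$. Thus

$$\mathbb{P}\Big(\lim_{t \downarrow 0} \tilde{B}_t = 0\Big) = \mathbb{P}\Big(\bigcap_n \bigcup_m \bigcap_{0 < q < 1/m,\, q \in \mathbb{Q}} \{|\tilde{B}_q| \le \tfrac{1}{n}\}\Big)$$

Now for some subtle reasoning: If we can show that $(\tilde{B}_q)$, $(B_q)$ are identically
distributed for $q > 0$, then

$$\mathbb{P}\Big(\bigcap_n \bigcup_m \bigcap_{0 < q < 1/m,\, q \in \mathbb{Q}} \{|\tilde{B}_q| \le \tfrac{1}{n}\}\Big) = \mathbb{P}\Big(\bigcap_n \bigcup_m \bigcap_{0 < q < 1/m,\, q \in \mathbb{Q}} \{|B_q| \le \tfrac{1}{n}\}\Big)$$

But the right-hand side is just $\mathbb{P}(\lim_{t \downarrow 0} B_t = 0)$, and this equals 1, since
$B_t$ is a.s. continuous at $t = 0$.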






Hence if $(\tilde{B}_q)$, $(B_q)$ are identically distributed for $q > 0$, then
$\mathbb{P}(\lim_{t \downarrow 0} \tilde{B}_t = 0) = \mathbb{P}(\lim_{t \downarrow 0} B_t = 0) = 1$,
as required. But we know that $\tilde{B}_t$ is Gaussian with the same means and covariances as $B_t$,
so that $(\tilde{B}_q)$, $(B_q)$ do indeed have the same finite-dimensional distributions. This
concludes the proof.

⊣

Note that the scaling property can also be usefully phrased as follows:
For any $a > 0$, $\frac{1}{\sqrt{a}} B_{at}$ is a Brownian motion.
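
The covariance computations behind all three parts of Proposition 1.6.1 can be checked mechanically. The sketch below assumes only $\mathrm{Cov}(B_s, B_t) = s \wedge t$ and bilinearity of covariance; the function names are illustrative. It verifies the covariance structure only: the Gaussian property and path continuity still need the arguments given in the proof.

```python
def cov_bm(s, t):
    """Covariance of standard Brownian motion: Cov(B_s, B_t) = min(s, t)."""
    return min(s, t)

def cov_scaled(s, t, c):
    """Covariance of c*B_{s/c^2} and c*B_{t/c^2}, for c != 0."""
    return c * c * cov_bm(s / c**2, t / c**2)

def cov_inverted(s, t):
    """Covariance of s*B_{1/s} and t*B_{1/t}, for s, t > 0."""
    return s * t * cov_bm(1.0 / s, 1.0 / t)

def cov_shifted(s, t, a):
    """Covariance of B_{s+a} - B_a and B_{t+a} - B_a, expanded by bilinearity."""
    return cov_bm(s + a, t + a) - cov_bm(s + a, a) - cov_bm(a, t + a) + cov_bm(a, a)

pairs = [(0.25, 0.5), (1.0, 3.0), (2.0, 2.0), (5.0, 0.1)]
checks = [
    (cov_scaled(s, t, c=-1.7), cov_inverted(s, t), cov_shifted(s, t, a=0.8), min(s, t))
    for s, t in pairs
]
for row in checks:
    print(row)
```

In every row the first three entries agree with the last one, $s \wedge t$, up to floating-point rounding.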

You also have to be careful about Brownian motions relative to a filtration here. For
example, if $B_t$ is an $\mathcal{F}_t$-Brownian motion and $c > 1$, then $cB_{t/c^2}$ is not an
$\mathcal{F}_t$-Brownian motion: If $c > 1$, then $t/c^2 < t$, and thus $B_{t/c^2} - B_{s/c^2}$ need
not be independent of $\mathcal{F}_s$. (Take $c = 2$, $t = 2$, $s = 1$ for example.)

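Proposition 1.6.2 A standard Brownian motion $B_t$ has the following property:

$$\mathbb{P}\Big(\sup_t B_t = +\infty,\ \inf_t B_t = -\infty\Big) = 1$$

Thus, with probability 1, a Brownian sample path will eventually exceed all bounds, positive
and negative.

Proof: Let $Z = \sup_{t \ge 0} B_t$, and let $c > 0$. Recall that $\tilde{B}_t := cB_{t/c^2}$ is a
Brownian motion also (the scaling property). Similarly $\hat{B}_t := B_{t+1} - B_1$ is a Brownian
motion also (time homogeneity). Hence $\tilde{Z} := \sup_t \tilde{B}_t$, $\hat{Z} := \sup_t \hat{B}_t$,
and $Z$ all have the same distribution. Now

$$\mathbb{P}(Z \le a) = \mathbb{P}\big(\sup_t B_t \le a\big)$$
$$= \mathbb{P}\big(\sup_t B_{t/c^2} \le a\big)$$
$$= \mathbb{P}\big(\sup_t \tilde{B}_t \le ca\big)$$
$$= \mathbb{P}(\tilde{Z} \le ca)$$
$$= \mathbb{P}(Z \le ca)$$

So for any $c > 0$ and any $a \in \mathbb{R}$, we have $\mathbb{P}(Z \le a) = \mathbb{P}(Z \le ca)$.
It follows that $\mathbb{P}(Z \le a) = \mathbb{P}(Z \le b)$ for any $0 < a, b < \infty$, and thus
$\mathbb{P}(0 < Z < \infty) = 0$. Since $Z$ is necessarily non-negative (because $B_0 = 0$) we have
$\mathbb{P}(Z = 0) + \mathbb{P}(Z = \infty) = 1$.

But $\mathbb{P}(Z = 0) \le \mathbb{P}(B_1 \le 0 \text{ and } B_u \le 0 \text{ for all } u \ge 1)$, as
$\{\sup_t B_t = 0\} \subseteq \{B_1 \le 0\} \cap \{B_u \le 0, \text{ all } u \ge 1\}$. Now $\hat{Z}$ has
the same distribution as $Z$, and therefore takes values $0$ and $\infty$ only. But if
$\hat{Z}(\omega) = \infty$, then obviously $\sup_{u \ge 1} B_u(\omega) = \infty$ also. Hence if
$B_u(\omega) \le 0$ for all $u \ge 1$, then $\hat{Z}(\omega) = 0$, and hence
$\{B_1 \le 0, B_u \le 0, \text{ all } u \ge 1\} \subseteq \{B_1 \le 0, \hat{Z} = 0\}$. But as $B_1$ and
$B_{t+1} - B_1$ are independent for all $t$, we see that $B_1, \hat{Z}$ are independent. Hence
$\mathbb{P}(Z = 0) \le \mathbb{P}(B_1 \le 0)\,\mathbb{P}(\hat{Z} = 0) = \frac{1}{2}\mathbb{P}(Z = 0)$.
We conclude that $\mathbb{P}(Z = 0) = 0$, so that $\mathbb{P}(Z = \infty) = 1$.

Similarly $\mathbb{P}(\inf_t B_t = -\infty) = 1$, and thus
$\mathbb{P}(\sup_{t \ge 0} B_t = +\infty,\ \inf_{t \ge 0} B_t = -\infty) = 1$ as well.

⊣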






Corollary 1.6.3 If $h > 0$, then $\mathbb{P}(B_t = 0 \text{ i.o. for } 0 < t < h) = 1$.

Thus if a Brownian motion crosses the $t$-axis, it will do so again infinitely often in any
succeeding time interval, no matter how small, with probability 1.

Proof: Exercise.

[Hint: Use the previous proposition, time inversion and time homogeneity.]

⊣

Thus Brownian motion is extremely "bouncy", and this bounciness is what leads to the
difficulties in the definition of the stochastic integral later on.

Definition 1.6.4 A stochastic process $X_t$ is said to be $\alpha$-self-similar for some $\alpha > 0$
if and only if for any $c > 0$ and any $0 \le t_1 < \dots < t_n$ the random vectors

$$c^{\alpha}(X_{t_1}, \dots, X_{t_n}) \quad \text{and} \quad (X_{ct_1}, \dots, X_{ct_n})$$

are identically distributed.

□

Note that Brownian motion is $\frac{1}{2}$-self-similar, by the scaling property.

Proposition 1.6.5 If a stochastic process $X_t$ is $\alpha$-self-similar and has stationary increments
for some $0 < \alpha < 1$, then for any $t_0 \ge 0$ we have

$$\limsup_{t \downarrow t_0} \frac{|X_t - X_{t_0}|}{t - t_0} = \infty$$

with probability 1.

It follows that $X_t$ is not differentiable at $t_0$ with probability 1.



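Proof: Because $X_t$ has stationary increments, we may, by shifting if necessary, assume that
$t_0 = 0$. Choose a countable sequence $t_n \downarrow 0$. By self-similarity, $c^{\alpha}X_0$ and
$X_{c \cdot 0} = X_0$ are identically distributed for every $c > 0$. Thus $X_0 = 0$ a.s.

Now if $x > 0$, then

$$\mathbb{P}\Big(\lim_{n \to \infty} \sup_{0 < s \le t_n} \Big|\frac{X_s}{s}\Big| \ge x\Big) \ge \lim_{n \to \infty} \mathbb{P}\Big(\sup_{0 < s \le t_n} \Big|\frac{X_s}{s}\Big| > x\Big)$$
$$\ge \limsup_{n \to \infty} \mathbb{P}\Big(\Big|\frac{X_{t_n}}{t_n}\Big| > x\Big)$$
$$= \limsup_{n \to \infty} \mathbb{P}\big(t_n^{\alpha - 1}|X_1| > x\big)$$
$$= 1$$

since $\alpha - 1 < 0$, so that $t_n^{\alpha - 1} \uparrow \infty$ (here we use that $X_1 \ne 0$ a.s.,
as is the case for Brownian motion). Since $x > 0$ was arbitrary,
$\mathbb{P}\big(\limsup_{t \downarrow 0} \big|\frac{X_t}{t}\big| = \infty\big) = 1$ for any sequence
$t_n \downarrow 0$.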






Corollary 1.6.6 Given $t_0 \ge 0$, then with probability 1, Brownian motion is not differentiable
at $t_0$.

⊣

In fact, a stronger result is true: 
Theorem 1.6.7 Almost surely, a Brownian motion sample path is nowhere differentiable. 

This result is harder to prove, and can be omitted. 

Proof: By the self-similarity of Brownian motion, it clearly suffices to prove the result on 
[0,1]. 

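Suppose that $\omega \in \Omega$ and $t \ge 0$ are such that $B'_t(\omega)$ exists. Then both the
left and right derivatives exist (and are equal). We prove that the set of $\omega$ for which the
right derivative exists at some $t$ has measure 0.

Now if the right derivative of $B(\omega)$ exists at time $t$, then there is $N \in \mathbb{N}$ such
that

$$\limsup_{h \downarrow 0} \frac{|B_{t+h}(\omega) - B_t(\omega)|}{h} < N$$

i.e. $|B_{t+h}(\omega) - B_t(\omega)| \le Nh$ for all sufficiently small $h > 0$. In particular, there
exists an $m \in \mathbb{N}$ such that $0 < h < \frac{1}{m}$ implies
$|B_{t+h}(\omega) - B_t(\omega)| \le Nh$. For reasons that will become apparent in a moment, put
$n = 4m$. Then we have shown that for any $\omega$ and any $t$, if the right derivative of
$B(\omega)$ at $t$ exists, then we can find $N$ and $n$ such that

$$0 < h < \frac{4}{n} \quad \text{implies} \quad |B_{t+h}(\omega) - B_t(\omega)| \le Nh$$

Now define

$$A_{n,N} = \Big\{\omega : \exists t \in [0,1] \text{ such that } |B_{t+h}(\omega) - B_t(\omega)| \le Nh \text{ whenever } 0 < h < \frac{4}{n}\Big\}$$

It is now clear that

$$\{\omega : \exists t\, (B'_t(\omega) \text{ exists})\} \subseteq \bigcup_N \bigcup_n A_{n,N}$$

Indeed, the set of $\omega$ for which $B(\omega)$ has a right derivative at some $t$ is contained in
the union on the right-hand side.

For future reference, note that $A_{n,N}$ is increasing in $n$, i.e.

$$A_{1,N} \subseteq A_{2,N} \subseteq \dots \subseteq A_{n,N} \subseteq \dots$$

To prove the theorem, it suffices to show that each $A_{n,N}$ has measure 0. So fix
$n, N \in \mathbb{N}$ and define

$$\Delta_n(k) = B_{\frac{k+1}{n}} - B_{\frac{k}{n}}$$

Further define, for $\omega \in A_{n,N}$ and $t$ as in the definition of $A_{n,N}$,

$$k^n_t = \inf\{k \in \mathbb{N} : \tfrac{k}{n} \ge t\}$$

Note that $t \le \frac{k^n_t}{n} < \frac{k^n_t + 1}{n} < \frac{k^n_t + 2}{n} < \frac{k^n_t + 3}{n}$ and
that $\frac{k^n_t + j}{n} - t < \frac{j+1}{n}$. By the triangle inequality, therefore,

$$|\Delta_n(k^n_t + j)| \le \frac{7N}{n} \quad \text{for } j = 0, 1, 2$$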






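For example, writing $k = k^n_t$,

$$|\Delta_n(k + 2)| = \big|B_{\frac{k+3}{n}} - B_{\frac{k+2}{n}}\big|$$
$$\le \big|B_{\frac{k+3}{n}} - B_t\big| + \big|B_{\frac{k+2}{n}} - B_t\big|$$
$$\le \frac{4N}{n} + \frac{3N}{n} = \frac{7N}{n}$$

It follows that

$$A_{n,N} \subseteq \bigcup_{k=0}^{n} \bigcap_{j=0}^{2} \Big\{|\Delta_n(k+j)| \le \frac{7N}{n}\Big\}$$

and so

$$\mathbb{P}(A_{n,N}) \le \sum_{k=0}^{n} \mathbb{P}\Big(\bigcap_{j=0}^{2} \Big\{|\Delta_n(k+j)| \le \frac{7N}{n}\Big\}\Big)$$

But

$$\mathbb{P}\Big(\bigcap_{j=0}^{2} \Big\{|\Delta_n(k+j)| \le \frac{7N}{n}\Big\}\Big) = \mathbb{P}\Big(|\Delta_n(k)| \le \frac{7N}{n}\Big)\, \mathbb{P}\Big(|\Delta_n(k+1)| \le \frac{7N}{n}\Big)\, \mathbb{P}\Big(|\Delta_n(k+2)| \le \frac{7N}{n}\Big)$$

(by independence of increments)

$$= \Big[\mathbb{P}\Big(|Z| \le \frac{7N}{\sqrt{n}}\Big)\Big]^3 \quad \text{where } Z \sim N(0,1)$$
$$= \Big(\frac{2}{\sqrt{2\pi}} \int_0^{7N/\sqrt{n}} e^{-x^2/2}\, dx\Big)^3$$
$$\le \Big(\frac{14N}{\sqrt{2\pi n}}\Big)^3 \quad \text{because } e^{-x^2/2} \le 1$$

Hence

$$\mathbb{P}(A_{n,N}) \le (n+1)\Big(\frac{14N}{\sqrt{2\pi n}}\Big)^3 \to 0 \quad \text{as } n \to \infty$$

But since $A_{n,N} \subseteq A_{n+1,N} \subseteq A_{n+2,N} \subseteq \dots$, it is clear that we must have
$\mathbb{P}(A_{n,N}) = 0$ for all $n, N$.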






1.7 Exercises 

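Exercise 1.7.1 Write a program to draw a random walk $R_t := \sum_{k=1}^{n} X_k$ where
$t = n\Delta t$, and $\mathbb{P}(X_k = \pm\Delta x) = \frac{1}{2}$. Play around with the relationship
between $\Delta x$ and $\Delta t$ and see what happens as $\Delta t \to 0$ (e.g. try
$\Delta x = \sqrt{\Delta t},\ \Delta t,\ \Delta t^2,\ 0.2\sqrt{\Delta t}$ etc.).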
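
One possible starting point for this exercise is sketched below. It is only a sketch under stated assumptions: the function names, the use of Python's `random` module and the Monte Carlo variance check are illustrative choices, and plotting is left to whatever library you prefer.

```python
import random

def random_walk(t_end, dt, dx, seed=None):
    """Return times and values of R_t = X_1 + ... + X_n with P(X_k = +dx) = P(X_k = -dx) = 1/2."""
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    times, values = [0.0], [0.0]
    for k in range(1, n_steps + 1):
        step = dx if rng.random() < 0.5 else -dx
        times.append(k * dt)
        values.append(values[-1] + step)
    return times, values

def terminal_variance(t_end, dt, dx, n_paths=2000, seed=0):
    """Monte Carlo estimate of Var(R_{t_end}); the exact value is (t_end / dt) * dx**2."""
    total = 0.0
    for i in range(n_paths):
        _, values = random_walk(t_end, dt, dx, seed=seed + i)
        total += values[-1] ** 2
    return total / n_paths

dt = 0.01
print(terminal_variance(1.0, dt, dx=dt ** 0.5))  # stays of order 1 as dt shrinks
print(terminal_variance(1.0, dt, dx=dt))         # of order dt: the walk flattens out
```

Since $\mathrm{Var}(R_t) = (t/\Delta t)\,\Delta x^2$, only the choice $\Delta x \propto \sqrt{\Delta t}$ keeps the variance at time $t$ away from both $0$ and $\infty$ as $\Delta t \to 0$.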



Exercise 1.7.2 Show that if $B_t$ is a standard Brownian motion, then
$\mathrm{Cov}(B_s, B_t) = s \wedge t$ (where $s \wedge t := \min\{s, t\}$).

Exercise 1.7.3 Suppose that $B_t$ is a standard Brownian motion. Then so are

(1) $\tilde{B}_t = cB_{t/c^2}$ for $c \ne 0$.

(2) $\tilde{B}_t = B_{t+a} - B_a$ for any $a \ge 0$.

Exercise 1.7.4 Show that if $h > 0$, then
$\mathbb{P}(B_t = 0 \text{ infinitely often for } 0 < t < h) = 1$.
[Hint: Use Proposition 1.6.2 and time inversion.]

Exercise 1.7.5 Suppose that $B_t$ is a standard 1-dimensional Brownian motion. Prove that
$B_t$, $B_t^2 - t$ and $\exp(\theta B_t - \frac{1}{2}\theta^2 t)$ are all martingales (with respect to
the natural filtration). Here $\theta \in \mathbb{R}$ is constant.

Exercise 1.7.6 Give an example of two uncorrelated normally distributed random variables 
X, Y that are not independent. 

[Hint: Let $a > 0$ be an as yet unspecified real, and let $X \sim N(0,1)$. Define $Y$ as follows:

$$Y = \begin{cases} X & \text{if } |X| \le a \\ -X & \text{else} \end{cases}$$

Show that $Y \sim N(0,1)$ as well. Further,

$$\mathbb{E}[XY] = \frac{2}{\sqrt{2\pi}} \Big(\int_0^a x^2 e^{-x^2/2}\, dx - \int_a^{\infty} x^2 e^{-x^2/2}\, dx\Big)$$

Now show that it is possible to choose $a$ so that $\mathbb{E}[XY] = 0$. Show that $X, Y$ are
uncorrelated random variables, but that $X, Y$ are not independent. Conclude that $(X, Y)$ is not
bivariate normal.]
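
The existence of a suitable $a$ can be explored numerically. The sketch below is an illustration rather than a solution to the exercise: it assumes the closed form $\mathbb{E}[X^2; |X| \le a] = \mathrm{erf}(a/\sqrt{2}) - 2a\varphi(a)$ (obtained by integration by parts, with $\varphi$ the standard normal density) and locates the root by bisection. The function names and the bracketing interval are illustrative choices.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def exy(a):
    """E[XY] for Y = X on {|X| <= a} and Y = -X otherwise, with X ~ N(0, 1)."""
    # E[X^2; |X| <= a] = erf(a / sqrt(2)) - 2 a phi(a), by integration by parts.
    inner = math.erf(a / math.sqrt(2.0)) - 2.0 * a * phi(a)
    # E[X^2] = 1, so the contribution of {|X| > a} is 1 - inner, entering with a minus sign.
    return inner - (1.0 - inner)

def solve_a(lo=0.1, hi=5.0, tol=1e-12):
    """Bisection for the root of exy on [lo, hi]; exy is increasing in a."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if exy(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_star = solve_a()
print(a_star, exy(a_star))
```

The bisection works because $\mathbb{E}[XY]$ is continuous and increasing in $a$, close to $-1$ for small $a$ and close to $+1$ for large $a$; the same observation gives the existence argument asked for in the hint.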

Exercises 1.7.7 (1) Show that if X,Y are independent normal random variables, then 
{X, Y) is bivariate normal. Conclude that X + Y is normal, as well. 

(2) Show that the a.s. limit of multivariate Gaussian vectors is multivariate Gaussian. 

[Hint: (2) Suppose that $X^n = (X^n_1, \dots, X^n_k)$ and that $X^n \to X$ a.s. You must show that
$X = (X_1, \dots, X_k)$ is multivariate Gaussian. Clearly $X^n \to X$ in distribution as well. Let
$\varphi_n$ be the characteristic function of $X^n$ and let $\varphi$ be the characteristic function
of $X$. Because weak convergence is equivalent to pointwise convergence of characteristic
functions, we have $\varphi_n(\theta) \to \varphi(\theta)$ for all $\theta \in \mathbb{R}^k$. From the
structure of the $\varphi_n$ conclude, using Lévy's Inversion Theorem, that $X$ is also multivariate
normal.]

Exercise 1.7.8 Calculate $\mathbb{E}[B_s \mid B_t = a]$ and $\mathrm{Var}(B_s \mid B_t = a)$ in the case:

(a) $t < s$

(b) $t > s$

Interpret your result in (b) geometrically.

[Hint: (b) If $f_{X,Y}(x,y)$ is the joint density function of $(X, Y)$, then

$$f_{X \mid Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$$

is the density function of $X$ given $Y = y$.]



Exercise 1.7.9 Show that the a.s. limit of multivariate Gaussian vectors is multivariate
Gaussian.

What about convergence in probability and weak convergence?

[Hint: Suppose that $X^n = (X^n_1, \dots, X^n_k)$ and that $X^n \to X$ a.s. You must show that
$X = (X_1, \dots, X_k)$ is multivariate Gaussian. Clearly $X^n \to X$ in distribution as well. Let
$\varphi_n$ be the characteristic function of $X^n$ and let $\varphi$ be the characteristic function
of $X$. Because weak convergence is equivalent to pointwise convergence of characteristic
functions, we have $\varphi_n(\theta) \to \varphi(\theta)$ for all $\theta \in \mathbb{R}^k$. From the
structure of the $\varphi_n$ conclude, using Lévy Inversion, that $X$ is also multivariate normal.]



Exercise 1.7.10 On pp. 10-12 of Diffusions, Markov Processes and Martingales by Rogers
& Williams there is a construction of Brownian motion, due to Ciesielski. The aim of this
exercise is to work through that proof, which is reproduced below. Ciesielski's method
constructs a Brownian motion on the unit interval $[0,1]$. The shifting property of Brownian
motion then allows it to be extended to all of $\mathbb{R}_+$.

Take some probability space on which there is defined an infinite sequence of independent
$N(0,1)$ random variables. For reasons that will soon be apparent, we assume that they
are indexed as $\{Z_{k,n} : n \in \mathbb{N},\ k \text{ odd},\ k \le 2^n\}$. Now define

$$g_{1,0}(t) = 1$$

$$g_{k,n}(t) = \begin{cases} 2^{(n-1)/2} & \text{if } (k-1)2^{-n} < t \le k2^{-n} \\ -2^{(n-1)/2} & \text{if } k2^{-n} < t \le (k+1)2^{-n} \\ 0 & \text{else} \end{cases}$$

for $n \ge 1$, $k \le 2^n$, $k$ odd. For notational convenience, let
$S_n = \{(k,n) : k \text{ odd},\ k \le 2^n\}$, $S = \bigcup_{n \ge 0} S_n$. The first thing to notice is
that $\{g_{k,n} : (k,n) \in S\}$ is a complete orthonormal system in $L^2[0,1]$. The orthonormality
of the $g_{k,n}$ is easy to check; and, for completeness, if $f \in L^2[0,1]$ were orthogonal to all
the $g_{k,n}$, then $F(t) = \int_0^t f(u)\, du$ would vanish at $0$ and $1$ (since $f \perp g_{1,0}$);
and also at $\frac{1}{2}$ (since $f \perp g_{1,1}$); and also at $\frac{1}{4}, \frac{3}{4}$ (since
$f \perp g_{1,2}, g_{3,2}$), .... Thus $F = 0$, and $f = 0$.






Now define $f_{k,n}(t) = \int_0^t g_{k,n}(u)\, du$, and the approximations $B_n(\cdot)$ to Brownian
motion by

$$B_n(t) = \sum_{m=0}^{n} \sum_{(k,m) \in S_m} Z_{k,m} f_{k,m}(t)$$

Let us describe what these approximations are doing.

The first approximation $B_0$ is simply $tZ_{1,0}$, a straight line. The next approximation
is obtained by adding on a Gaussian multiple of $f_{1,1}$, which is a tent-shaped function,
vanishing at $0$ and $1$. The next approximation is obtained by adding on two Gaussian
multiples of tent-shaped functions, which both vanish at $0$, $\frac{1}{2}$ and $1$....
The next stage of the proof is to establish that the $B_n$ converge uniformly almost surely.
Indeed, for any positive constant $a_n$,

$$\mathbb{P}\Big(\sup_{0 \le t \le 1} |B_n(t) - B_{n-1}(t)| > a_n\Big) = \mathbb{P}\Big(\sup_k |Z_{k,n}| > 2^{(n+1)/2} a_n\Big)$$

(since the $f_{k,n}$ are all at most $2^{-(n+1)/2}$)

$$\le 2^{n-1}\, \mathbb{P}\big(|Z_{1,n}| > 2^{(n+1)/2} a_n\big)$$
$$\le (4\pi)^{-1/2}\, 2^{n/2}\, a_n^{-1} \exp(-a_n^2 2^n)$$

by the elementary estimate

$$\int_x^{\infty} \exp(-\tfrac{1}{2}y^2)\, dy \le x^{-1} \exp(-\tfrac{1}{2}x^2)$$

We now aim to choose the constants $a_n$ in such a way that

$$\sum_n 2^{n/2} a_n^{-1} \exp(-a_n^2 2^n) < \infty$$

$$\sum_n a_n < \infty$$

The first of these conditions will ensure that, almost surely,

$$\sup_{0 \le t \le 1} |B_n(t) - B_{n-1}(t)| \le a_n \quad \text{for all large enough } n;$$

the second will guarantee that the $B_n$ converge uniformly (almost surely) to a limit $B$,
which is therefore continuous. But these conditions are satisfied by the choice
$a_n = (n2^{-n})^{1/2}$, for example.

Thus we have proved that, almost surely, the $B_n$ converge to some continuous limit $B$,
which we now must show is Brownian motion. As we saw..., the simplest way to do this is
to check that $B$ is a zero-mean Gaussian process with covariance structure
$\mathbb{E}(B_s B_t) = s \wedge t$.






Obviously, each $B_n$ is a zero-mean Gaussian process: the vector $(B_n(t_1), \dots, B_n(t_k))$
is multivariate Gaussian. This converges almost surely (and so in distribution) to
$(B(t_1), \dots, B(t_k))$, and the limit of the covariances of the $B_n$ gives the covariance of
$B$. But

$$\mathbb{E}[B_n(s)B_n(t)] = \sum_{m=0}^{n} \sum_{(k,m) \in S_m} f_{k,m}(s) f_{k,m}(t)$$

by the independence of the $Z_{k,m}$, and this converges as $n \uparrow \infty$ to

$$\sum_{(k,m) \in S} f_{k,m}(s) f_{k,m}(t) = \int_0^1 I_{[0,s]}(u)\, I_{[0,t]}(u)\, du = s \wedge t$$

since $f_{k,m}(s) = \int_0^1 I_{[0,s]}(u)\, g_{k,m}(u)\, du$ is the Fourier coefficient of $g_{k,m}$ in
the representation of $I_{[0,s]}$ in terms of the complete orthonormal system
$\{g_{k,n} : (k,n) \in S\}$. Parseval's identity concludes the proof.
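
The construction is easy to experiment with. The following sketch implements the tent functions $f_{k,n}$, the index sets $S_n$ and the approximations $B_n$ as defined above; the function names and the evaluation grid are illustrative choices. It also evaluates the partial sums $\sum_{m \le n} \sum_{(k,m) \in S_m} f_{k,m}(s) f_{k,m}(t)$, which can be compared with $s \wedge t$.

```python
import random

def tent(k, n, t):
    """f_{k,n}(t): the integral from 0 to t of g_{k,n}, for 0 <= t <= 1."""
    if n == 0:
        return t  # g_{1,0} = 1
    h = 2.0 ** (-n)
    height = 2.0 ** ((n - 1) / 2.0)
    left, mid, right = (k - 1) * h, k * h, (k + 1) * h
    if t <= left or t >= right:
        return 0.0
    if t <= mid:
        return height * (t - left)
    return height * (right - t)

def index_set(n):
    """S_n = {(k, n) : k odd, k <= 2^n}."""
    return [(k, n) for k in range(1, 2 ** n + 1, 2)]

def covariance_partial_sum(s, t, n):
    """Sum of f_{k,m}(s) * f_{k,m}(t) over all (k, m) in S_m with m <= n."""
    return sum(tent(k, m, s) * tent(k, m, t)
               for m in range(n + 1) for k, _ in index_set(m))

def sample_path(n, grid, rng):
    """Evaluate the approximation B_n on the grid for one draw of the Z_{k,m}."""
    z = {(k, m): rng.gauss(0.0, 1.0) for m in range(n + 1) for k, _ in index_set(m)}
    return [sum(z[(k, m)] * tent(k, m, t) for (k, m) in z) for t in grid]

grid = [j / 8 for j in range(9)]
print(sample_path(3, grid, random.Random(0)))
print(covariance_partial_sum(0.375, 0.625, 3), min(0.375, 0.625))
```

At dyadic points $s, t$ of level at most $n$ the partial sum already equals $s \wedge t$, because all tents of higher level vanish there; at other points it converges to $s \wedge t$ as $n$ grows.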



(a) Draw graphs of the functions $g_{1,0}, g_{1,1}, g_{1,2}, g_{3,2}, g_{1,3}, \dots, g_{7,3}$ to get
a feeling of what these functions look like.

(b) It is asserted in the proof that the family $\{g_{k,n} : (k,n) \in S\}$ is a complete
orthonormal system in the Hilbert space $L^2[0,1] = L^2([0,1], \mathcal{B}[0,1], \lambda)$, where
$\mathcal{B}[0,1]$ is the family of Borel subsets of the compact interval $[0,1]$ and $\lambda$ is
Lebesgue measure.

(i) Show that the $g_{k,n}$ are orthonormal, i.e. show that

$$\langle g_{k,n}, g_{l,m} \rangle = \begin{cases} 0 & \text{if } (k,n) \ne (l,m) \\ 1 & \text{else} \end{cases}$$

Here $\langle f, g \rangle$ is the usual inner product in $L^2[0,1]$, i.e.

$$\langle f, g \rangle = \int_0^1 fg\, dx$$

(ii) In a general Hilbert space, an orthonormal system $\{u_n : n \in \mathbb{N}\}$ is called
complete if and only if for every $u$, we have

$$u = \sum_{n=0}^{\infty} \langle u_n, u \rangle u_n$$

i.e. the above series converges, and converges to $u$. Note that $\langle u_n, u \rangle$ is just the
projection of $u$ onto the unit vector $u_n$. Show that a system of orthonormal vectors
is complete if and only if

$$\langle u_n, u \rangle = 0 \text{ for all } n \iff u = 0$$






(iii) Haul out some old notes on Fourier series and identify the complete orthonormal
system of functions used there. Write the Fourier coefficients of a continuous function
$f$ on a compact interval as inner products. It is for this reason that the projections
$\langle u_n, u \rangle$ are called the generalized Fourier coefficients of $u$ (w.r.t. the basis
$\{u_n\}$).

(iv) Now show that the set of $g_{k,n}$ is, in fact, a complete orthonormal system.

(c) Draw the graphs of the first few tent-shaped functions $f_{1,0}, \dots, f_{7,3}$.

(d) In the proof that the approximations $B_n$ converge uniformly almost surely, the following
inequality is used:

$$\int_x^{\infty} \exp(-y^2/2)\, dy \le x^{-1} \exp(-x^2/2)$$

Prove that this inequality holds.

(e) Verify that the choice of the constants $a_n = (n2^{-n})^{1/2}$ will satisfy the two conditions
asserted in the proof.

(f) Now that we know that the $B_n$ converge uniformly (almost surely) to something, it
remains to prove that that something is a Brownian motion, i.e. a continuous Gaussian
process $B$ with $\mathrm{Cov}(B_s, B_t) = s \wedge t$.

(i) Why is the limit $B$ of the $B_n$ continuous a.s.?

(ii) "Obviously, each $B_n$ is a zero-mean Gaussian process." Why?

(iii) Show that

$$\mathrm{Cov}(B(s), B(t)) = \lim_{n \to \infty} \mathrm{Cov}(B_n(s), B_n(t)) = \sum_{(k,m) \in S} f_{k,m}(s) f_{k,m}(t)$$

(iv) The indicator functions $I_{[0,t]}$ can be represented as a generalized Fourier series

$$I_{[0,t]} = \sum_{(k,n) \in S} \langle g_{k,n}, I_{[0,t]} \rangle\, g_{k,n}$$

because the system $\{g_{k,n}\}$ is complete. Show that the generalized Fourier coefficients
of $I_{[0,t]}$ are given by

$$\langle g_{k,n}, I_{[0,t]} \rangle = f_{k,n}(t)$$

(v) Now note that

$$s \wedge t = \int_0^1 I_{[0,s]}(u)\, I_{[0,t]}(u)\, du$$

and prove that

$$\sum_{(k,m) \in S} f_{k,m}(t) f_{k,m}(s) = s \wedge t$$

(vi) We now know that $B$ is a continuous process with the right covariance structure.
It remains to show that $B$ is, in fact, a Gaussian process. Use Exercise 1.7.9 to
accomplish this.

The proof is now complete. 






Chapter 2 

Martingale Theory 



2.1 Elements of Discrete-Time Martingales



Martingales are amongst the most important objects in probability theory, and an entire sub- 
discipline of finance is based on them. Brownian motion is the most important continuous- 
parameter martingale, and is heavily used in financial modelling. In this chapter we first 
introduce the basic results about discrete-time martingales at a leisurely rate, taking time 
to build up intuition and facility with martingale calculations. In the next chapter, we will 
tackle continuous-parameter martingales. 



2.1.1 Stochastic Processes and Filtrations 



Informally, a (discrete-parameter) stochastic process $X$ is a family of random variables
indexed by a discrete time set, i.e. $X = X_1, X_2, X_3, \dots$. The idea is that these model the
outcomes of a series of random phenomena, such as the closing values of the S&P500. The
$X_n$ are thus successive values of some quantity under consideration. Note that the times of
the random variables may not be evenly distributed in physical time; for example, the share
index is recorded only on trading days.

We assume that the stochastic process $X = (X_n : n \in I)$ is defined on some probability
space $(\Omega, \mathcal{F}, \mathbb{P})$. The time index set $I$ will usually be the set of natural
numbers, or the set of non-negative reals, or some finite initial segment of these. For a
particular outcome $\omega \in \Omega$ the sequence $X_1(\omega), X_2(\omega), \dots$ is called a sample
path of the process. Note that one outcome/state-of-the-world $\omega$ determines the values of all
the $X_n$. We only know the value of $X_n$ at time $n$, and so as time $n$ increases, so does our
knowledge of the state of the world. Since information is organised in $\sigma$-algebras, we
associate with each time $n$ a $\sigma$-algebra $\mathcal{F}_n$ modelling the knowledge at time $n$.
We also assume that no information is lost or forgotten, so that information available at time
$n$ is also available at a later time $m \ge n$. This simply means that
$\mathcal{F}_m \supseteq \mathcal{F}_n$. We thus model the flow of information as follows:






Definition 2.1.1 Suppose that $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space. An
increasing sequence

$$\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \dots \subseteq \mathcal{F}_n \subseteq \dots \subseteq \mathcal{F}$$

of $\sigma$-algebras on $\Omega$ is called a filtration. We shall always assume that
$\mathcal{F}_0$ contains all the sets of measure 0.
We also define

$$\mathcal{F}_{\infty} = \sigma\Big(\bigcup_n \mathcal{F}_n\Big) \subseteq \mathcal{F}$$

$\mathcal{F}_n$ represents the available information at time $n$, i.e. it contains all events $A$
for which it is possible to decide at time $n$ whether $A$ has occurred or not.

Suppose that $X_t$ is the share price at time $t$. We know $X_2$ at time $t = 2$. Thus each of the
following events can be decided at time $t = 2$: Whether or not $X_2 = 5.00$; whether or not $X_2$
lies between 13.50 and 15.76, etc. It therefore follows that $X_2$ must be
$\mathcal{F}_2$-measurable, i.e. that $\sigma(X_2) \subseteq \mathcal{F}_2$. Moreover, $X_1$ is also
known at $t = 2$, so $\sigma(X_1, X_2) \subseteq \mathcal{F}_2$. However, at $t = 2$ we do not know
the share price at time $t = 3$. Thus $X_3$ is not $\mathcal{F}_2$-measurable, although it is, of
course, $\mathcal{F}_3$-measurable.

In essence, to model the fact that the value of $X_m$ is known at a later time $n$, we need
to add the restriction that $X_m$ is $\mathcal{F}_n$-measurable for all $n \ge m$. This just means
that $\sigma(X_1, \dots, X_n) \subseteq \mathcal{F}_n$, and so we define:



Definition 2.1.2 A stochastic process $X = (X_n, n \in I)$ is said to be adapted to a filtration
$(\mathcal{F}_n, n \in I)$ provided that each $X_n$ is $\mathcal{F}_n$-measurable. It follows trivially
that this is the case if and only if

$$\sigma(X_1, \dots, X_n) \subseteq \mathcal{F}_n$$

Exercise 2.1.3 Make sure that you can prove this trivial result.

□

Note that to say that $X$ is adapted to $\mathcal{F}_n$ simply means that the random variables
$X_n$ do not contain more information than the $\mathcal{F}_n$, although they may contain strictly
less.

Note also that $\mathcal{F}_n = \sigma(X_1, \dots, X_n)$ is the smallest filtration with respect to
which $X$ is adapted, i.e. that if $X$ is also adapted to a filtration $\mathcal{G}_n$, then
$\mathcal{F}_n \subseteq \mathcal{G}_n$. The filtration $\mathcal{F}_n$ contains just the information in
the values of $X$ up to time $n$, and is called the natural or canonical filtration of $X$. It
contains just as much information as is contained in the $X_n$, and no more.



2.1.2 Martingales, Submartingales, Supermartingales

Martingales model a fair game, submartingales a favourable game, and supermartingales an 
unfavourable game. Here is the definition: 






Definition 2.1.4 A stochastic process $X = (X_n : n \in \mathbb{N})$ is called a supermartingale
(respectively submartingale) with respect to a filtration $(\mathcal{F}_n, n \in \mathbb{N})$ if and
only if

(a) Each $X_n \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, i.e. each $X_n$ is integrable, which just
means that $\mathbb{E}X_n$ exists, and is finite.

(b) $X$ is adapted to $(\mathcal{F}_n, n \in \mathbb{N})$.

(c) $\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] \le X_n$ (respectively,
$\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] \ge X_n$) for each $n \in \mathbb{N}$.

A martingale is simultaneously a sub- and a supermartingale, i.e. it satisfies
$\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = X_n$ for each $n \in \mathbb{N}$.

When we say that $X$ is a (super/sub-)martingale, but we don't mention a specific filtration,
then the natural filtration should be used.

(Note that we've taken $\mathbb{N}$ as the index set. You shouldn't have any trouble generalizing
the definition to the case where the index set is some initial segment $\{0, 1, 2, \dots, T\}$ of
$\mathbb{N}$.)

Think of $X_n$ as your total fortune after the $n$th round of a gambling game. If $X$ is a
supermartingale, your expected fortune at time $n + 1$ is at most your fortune at time $n$. It
follows that this particular game is unfavourable, i.e. that you are likely to lose. If $X$ is a
martingale, then your expected fortune equals your present fortune: You are just as likely to
win as to lose, and the game is fair.

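Examples 2.1.5 (a) Suppose that the $X_n, n \in \mathbb{N}$ are independent random variables with
$\mathbb{E}X_n = 0$, and that $\mathcal{F}_n, n \in \mathbb{N}$ is the natural filtration. Define
$S_n = X_1 + \dots + X_n$. Clearly $S = (S_n : n \in \mathbb{N})$ is a stochastic process adapted to
$\mathcal{F}_n, n \in \mathbb{N}$, and each $S_n$ is integrable. Moreover,

$$\mathbb{E}[S_{n+1} \mid \mathcal{F}_n] = \mathbb{E}[X_1 \mid \mathcal{F}_n] + \dots + \mathbb{E}[X_n \mid \mathcal{F}_n] + \mathbb{E}[X_{n+1} \mid \mathcal{F}_n]$$

Since $X_m$ is $\mathcal{F}_n$-measurable for $m \le n$, it follows that
$\mathbb{E}[X_m \mid \mathcal{F}_n] = X_m$ if $m \le n$. Moreover, since the $X_m$ are independent
random variables, $X_{n+1}$ is independent of $\mathcal{F}_n$, and thus we have
$\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = \mathbb{E}X_{n+1} = 0$. Hence

$$\mathbb{E}[S_{n+1} \mid \mathcal{F}_n] = X_1 + \dots + X_n + 0 = S_n$$

which proves that $S_n, n \in \mathbb{N}$ is a martingale.

(b) If we have the same situation as in (a), but with $\mathbb{E}X_n \ge 0$ for all $n$, then
$S_n, n \in \mathbb{N}$ is a submartingale.

(c) If $X_n, n \in \mathbb{N}$ are independent random variables with the same mean $0$ and the same
variance $\sigma^2$, and if $S_n = X_1 + X_2 + \dots + X_n$, then the process
$W_n = S_n^2 - n\sigma^2$ is a martingale with respect to the natural filtration of the $X_n$.
First note that each $W_n$ is integrable if and only if $S_n^2$ is, but this follows because the
variances $\sigma^2 = \mathbb{E}X_n^2$ exist, so that each
$X_n \in L^2(\Omega, \mathcal{F}, \mathbb{P})$. To verify the martingale property, observe that
$S_{n+1}^2 = S_n^2 + 2S_nX_{n+1} + X_{n+1}^2$. Further observe that
$\mathbb{E}[S_nX_{n+1} \mid \mathcal{F}_n] = S_n\mathbb{E}[X_{n+1} \mid \mathcal{F}_n]$, because $S_n$ is
$\mathcal{F}_n$-measurable, and that
$\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = \mathbb{E}X_{n+1} = 0$, since $X_{n+1}$ is independent of
$\mathcal{F}_n$. Thus:

$$\mathbb{E}[W_{n+1} - W_n \mid \mathcal{F}_n] = \mathbb{E}[(S_n + X_{n+1})^2 - S_n^2 - \sigma^2 \mid \mathcal{F}_n]$$
$$= 2\mathbb{E}[S_nX_{n+1} \mid \mathcal{F}_n] + \mathbb{E}[X_{n+1}^2 \mid \mathcal{F}_n] - \sigma^2$$
$$= 2S_n\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] + \mathbb{E}X_{n+1}^2 - \sigma^2$$
$$= 2S_n \cdot 0 + \mathrm{Var}(X_{n+1}) - \sigma^2$$
$$= 0$$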

(d) Suppose that $X_n$ are independent non-negative random variables with $\mathbb{E}X_n = 1$. Put
$M_0 = 1$, and define

$$M_n = X_1 \cdot X_2 \cdots X_n$$

for $n > 0$. Assume that each $M_n$ is integrable. It is left as an exercise to show that $M_n$
is a martingale.

(e) Consider a random walk. If it is symmetric, it is a martingale. If the probability $p$ of
going up is $\le 0.5$, it is a supermartingale.

(f) One more interesting martingale demonstrates the accumulation of information about
the value of a random variable over time. Let $Y$ be an integrable random variable (i.e.
$\mathcal{F}$-measurable). We do not necessarily know the value of $Y$ at time $n$: there may not be
enough information available. However, as time passes, we expect that our estimate will
become more accurate. At time $n$, the best available approximation to $Y$ is
$Y_n = \mathbb{E}[Y \mid \mathcal{F}_n]$. We now show that $Y_n$ is a martingale (with respect to the
filtration $\mathcal{F}_n$). Firstly,

$$\mathbb{E}Y_n = \mathbb{E}[\mathbb{E}(Y \mid \mathcal{F}_n)] = \mathbb{E}Y$$

by the "Tower Property". This shows that each $Y_n$ is integrable if $Y$ is. Next,

$$\mathbb{E}[Y_{n+1} \mid \mathcal{F}_n] = \mathbb{E}[\mathbb{E}[Y \mid \mathcal{F}_{n+1}] \mid \mathcal{F}_n] = \mathbb{E}[Y \mid \mathcal{F}_n] = Y_n$$

by the Tower Property again. This proves the result.

What this means is that there are no trends in our estimates of $Y$. At each new time
step, our revised estimate is just as likely to go up as it is to go down, and is expected
to remain at the same value as our previous estimate. This makes sense: If we expected
our estimates to increase, for example, then our estimates would not have been the best
available. We ought to have built the expectation of increase into our estimates already.

(g) Note that if $X_n$ is a martingale, and if $\varphi$ is a convex function (with each
$\varphi(X_n)$ integrable), then $\varphi(X_n)$ is a submartingale. Indeed,

$$\mathbb{E}[\varphi(X_{n+1}) \mid \mathcal{F}_n] \ge \varphi(\mathbb{E}[X_{n+1} \mid \mathcal{F}_n]) = \varphi(X_n)$$

by Jensen's inequality. It follows that if $X_n$ is a martingale and if $p \ge 1$, then $|X_n|^p$
is a submartingale.

□
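
For a fair coin-tossing walk the martingale property in Examples (a) and (c) can be verified exactly by enumerating all path prefixes, since conditioning on $\mathcal{F}_n$ amounts to fixing the first $n$ steps. The sketch below assumes steps $\pm 1$ (so $\sigma^2 = 1$); the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def conditional_expectation_next(process, prefix, p=Fraction(1, 2)):
    """E[process at time n+1 | first n steps = prefix] for a +/-1 walk with P(up) = p."""
    return p * process(prefix + (1,)) + (1 - p) * process(prefix + (-1,))

def s(path):
    """S_n = X_1 + ... + X_n."""
    return sum(path)

def w(path):
    """W_n = S_n^2 - n * sigma^2, with sigma^2 = 1 for fair +/-1 steps."""
    return sum(path) ** 2 - len(path)

def s_squared(path):
    """S_n^2, a submartingale but not a martingale."""
    return sum(path) ** 2

def is_martingale(process, n_max, p=Fraction(1, 2)):
    """Check E[process_{n+1} | F_n] = process_n on every path prefix of length < n_max."""
    return all(
        conditional_expectation_next(process, prefix, p) == process(prefix)
        for n in range(n_max)
        for prefix in product((1, -1), repeat=n)
    )

print(is_martingale(s, 6), is_martingale(w, 6), is_martingale(s_squared, 6))
```

Rational arithmetic keeps the comparison exact. Passing `p=Fraction(1, 3)` makes the check for `s` fail, in line with Example (e): the biased walk is a supermartingale, not a martingale.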

Remarks 2.1.6 (a) If $X_n, n \in \mathbb{N}$ is a martingale, then $\mathbb{E}X_n = \mathbb{E}X_0$ for
all $n$, i.e. all the $X_n$ have the same mean. This is an easy exercise.

(b) We have defined the martingale property with respect to a filtration. Thus if $X_n$ is a
martingale with respect to one filtration, it may not be with respect to another. However,
if $X_n$ is a martingale with respect to some filtration $\mathcal{G}_n$, and if
$\mathcal{F}_n = \sigma(X_1, \dots, X_n)$ is the natural filtration, then $X_n$ is also a martingale
with respect to $\mathcal{F}_n$. To see this, first note that each $X_n$ is
$\mathcal{G}_n$-measurable (because $X_n$ is adapted to $\mathcal{G}_n$, which is part of the
definition of martingale). Thus $\mathcal{F}_n \subseteq \mathcal{G}_n$ for each $n$. It now follows by
the Tower Property that

$$\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = \mathbb{E}[\mathbb{E}[X_{n+1} \mid \mathcal{G}_n] \mid \mathcal{F}_n] = \mathbb{E}[X_n \mid \mathcal{F}_n] = X_n$$

The last equality holds by "Taking out what is known", because $X_n$ is
$\mathcal{F}_n$-measurable. It is now not hard to see that if $X_n$ is a martingale with respect to
one filtration, it will also be a martingale with respect to any poorer (in information)
filtration to which it is adapted.

(c) The converse of (b) is not true: If $X_n$ is a martingale with respect to the natural
filtration, it may not be a martingale with respect to a richer (in information) filtration.
Find a simple example!

(d) Note that if $X_n$ is a martingale, and if $m \ge n$, then
$\mathbb{E}[X_m \mid \mathcal{F}_n] = X_n$. This is left as another exercise in the use of the Tower
Property.

□ 

The following exercise will prove extremely useful: 

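Exercise 2.1.7 (Orthogonality of Martingale Increments)
Prove that if $M_n$ is a martingale, then

$$\mathbb{E}[(M_n - M_m)^2 \mid \mathcal{F}_k] = \mathbb{E}[M_n^2 - M_m^2 \mid \mathcal{F}_k] \qquad k \le m \le n$$

Deduce that

$$\mathbb{E}[M_n^2] = \mathbb{E}M_0^2 + \sum_{m=1}^{n} \mathbb{E}[(M_m - M_{m-1})^2]$$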

□ 

2.1.3 Games and Strategies 

Suppose that you take part in a game of chance, e.g. a game of coin tossing, roulette, or
investing in the stock market. The game is repeated many times, and you place a bet each
time. Let $\xi_n, n \in \mathbb{N}$ be a sequence of integrable random variables which represent your
winnings (or losses, if negative) per unit stake in the $n$th game. Thus, if you had wagered a
stake $C_n$ on the $n$th game, you would have won $C_n\xi_n$.

If you played unit stakes all the way through, your total winnings after the $n$th game
would be

$$S_n = \xi_1 + \dots + \xi_n \quad \text{for } n \ge 1$$

Note that $S_0 = 0$, because you haven't won or lost anything yet.

If the game is fair then your chance of winning is the same as your chance of losing,
and thus $\mathbb{E}\xi_n = 0$. In that case, $S_n$ is a martingale with respect to the natural
filtration $\mathcal{F}_n = \sigma(\xi_1, \dots, \xi_n) = \sigma(S_1, \dots, S_n)$. Similarly, if the game
is unfavourable to you, then at time $n$ you expect your winnings at time $n + 1$ to be less than
your current winnings, i.e. $\mathbb{E}[S_{n+1} \mid \mathcal{F}_n] \le S_n$. Thus an unfavourable game
is modelled by a supermartingale. A favourable game will clearly be modelled by a
submartingale.

Suppose now that you have a system, i.e. a gambling strategy, which tells you when to
bet, how much to bet etc. Your system, call it $C$, will tell you what stake $C_n$ you should
place on the $n$th game. We allow negative stakes as well (which are essentially bets that you
will lose). In that case, your total winnings after the $n$th game will be

$$W_n = \sum_{k=1}^{n} C_k\xi_k$$

Now note that $\xi_n = S_n - S_{n-1} = \Delta S_n$, and thus that

$$W_n = \sum_{k=1}^{n} C_k(S_k - S_{k-1}) = \sum_{k=1}^{n} C_k\Delta S_k$$

which looks like a Riemann-Stieltjes sum. Your strategy $C = (C_n : n \in I)$ is also a stochastic
process, but since we have to decide what stake to wager before the outcome of the $n$th game
is known, we must be able to decide the value of $C_n$ on the basis of information available at
time $n - 1$ (i.e. after the $(n-1)$th game). Thus each $C_n$ is $\mathcal{F}_{n-1}$-measurable. We
have a name for this:

Definition 2.1.8 A stochastic process $C$ is called previsible (or non-anticipative, or
predictable), with respect to a filtration $\mathcal{F}_n$ provided that each $C_n$ is
$\mathcal{F}_{n-1}$-measurable, for $n \ge 1$. Note that $C_0$ is not defined.

Thus a gambling strategy is just a previsible process.
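
To make the role of previsibility concrete, here is a small sketch that computes the winnings $W_n = \sum_k C_k\xi_k$ of a strategy in a fair coin-tossing game ($\xi_k = \pm 1$ with probability $\frac{1}{2}$ each) and evaluates $\mathbb{E}W_n$ exactly by enumerating all outcomes. The strategy is modelled as a function of the past outcomes only, which is how previsibility is enforced; the function names and the "doubling" system are hypothetical examples, not part of the notes.

```python
from fractions import Fraction
from itertools import product

def winnings(strategy, outcomes):
    """W_n = sum over k of C_k * xi_k, where C_k may depend only on xi_1, ..., xi_{k-1}."""
    total = 0
    for k, xi in enumerate(outcomes):
        stake = strategy(outcomes[:k])  # previsible: the stake sees only the past
        total += stake * xi
    return total

def expected_winnings(strategy, n_games):
    """Exact E[W_n] for a fair game with xi_k = +/-1, each with probability 1/2."""
    paths = list(product((1, -1), repeat=n_games))
    return sum(Fraction(winnings(strategy, path)) for path in paths) / len(paths)

def unit_stakes(history):
    """Always stake 1, so that W_n = S_n."""
    return 1

def doubling(history):
    """Hypothetical system: stake 1, 2, 4, ... until the first win, then stop betting."""
    if 1 in history:
        return 0
    return 2 ** len(history)

print(expected_winnings(unit_stakes, 5), expected_winnings(doubling, 5))
```

For both strategies the expected winnings after five games are exactly zero: the doubling system wins 1 on most paths, but loses 31 on the single path of five straight losses.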

Consider an arbitrary adapted stochastic process $Y_n$. Then in general $Y_n$ may exhibit both
purely random behaviour and long-term trends. For example, for supermartingales the
long-term trend is that it tends to decrease. Purely random behaviour is described by
martingales, and trends are known beforehand, i.e. are previsible. We thus attempt to decompose
$Y_n$ into a martingale part and a previsible part, i.e. we try to write

$$Y_n = M_n + A_n$$

where $M_n$ is a martingale, with $M_0 = Y_0$, and $A_n$ is previsible, with $A_0 = 0$. In
engineering, $A_n$ is called the signal, and $M_n$ the noise.

Suppose that we can actually find such a decomposition. We would then have

$$Y_{n+1} - Y_n = (M_{n+1} - M_n) + (A_{n+1} - A_n)$$

Taking conditional expectations immediately yields

$$A_{n+1} - A_n = \mathbb{E}[Y_{n+1} \mid \mathcal{F}_n] - Y_n$$

so that

$$M_{n+1} - M_n = Y_{n+1} - \mathbb{E}[Y_{n+1} \mid \mathcal{F}_n]$$

We now use this pair of equations to define $M_n$ and $A_n$ in the next theorem.

* We need negative stakes to model short sales, which are essentially just bets that a stock will lose value.






Theorem 2.1.9 (Doob Decomposition Theorem)
Every adapted integrable process $Y_n$ has a unique decomposition

$$Y_n = M_n + A_n$$

where $M_n$ is a martingale with $M_0 = Y_0$, and $A_n$ is previsible, null at $n = 0$. Moreover, if
$Y_n$ is a supermartingale, then $A_n$ is decreasing.

Proof: Define $M_n$, $A_n$ inductively by

$$M_0 = Y_0, \qquad M_{n+1} = M_n + Y_{n+1} - \mathbb{E}[Y_{n+1} | \mathcal{F}_n]$$
$$A_0 = 0, \qquad A_{n+1} = A_n + \mathbb{E}[Y_{n+1} | \mathcal{F}_n] - Y_n$$

It is clear that $M_n$ is a martingale and that $A_n$ is previsible. Moreover

$$Y_m - Y_{m-1} = (M_m - M_{m-1}) + (A_m - A_{m-1})$$

and summing over $m$ from $m = 1$ to $m = n$ yields

$$Y_n = M_n + A_n$$

as required.

To see that this decomposition is unique, suppose that $Y_n = M'_n + A'_n$ is another
decomposition with the same properties. We show by induction on $n$ that $M = M'$, $A = A'$: Note
that $M_0 = M'_0$ by definition. Suppose that $M_n = M'_n$, and, consequently, that $A_n = A'_n$.
Then

$$M_{n+1} - M'_{n+1} = A'_{n+1} - A_{n+1}$$

Taking conditional expectations with respect to $\mathcal{F}_n$ we obtain

$$0 = M_n - M'_n = A'_{n+1} - A_{n+1}$$

because $A$, $A'$ are previsible. Hence $A_{n+1} = A'_{n+1}$, and so $M_{n+1} = M'_{n+1}$ as well. By
induction, we have $M_n = M'_n$, $A_n = A'_n$ for all $n \in \mathbb{N}$. This proves that the Doob Decomposition is
unique.

If $Y_n$ is a supermartingale, then $\mathbb{E}[Y_{n+1} | \mathcal{F}_n] \le Y_n$, so the definition of $A_{n+1}$ implies that

$$A_{n+1} \le A_n.$$

∎
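The inductive definition used in the proof can be followed step by step on a computer. The Python sketch below does this along a single path; it is an illustration under stated assumptions rather than part of the notes: the underlying game is assumed to be a fair $\pm 1$ coin-tossing game, so that a conditional expectation given $\mathcal{F}_n$ is simply the average over the two possible next steps, and the names `doob_decomposition` and `square_of_walk` are hypothetical.

```python
from fractions import Fraction

def doob_decomposition(Y, path):
    """Return (M, A) along one path, where Y maps a path prefix to the value Y_n.

    Assumption: fair +/-1 steps, so E[Y_{n+1} | F_n] is the average of Y over
    the two possible continuations of the current prefix.
    """
    M = [Fraction(Y(()))]   # M_0 = Y_0
    A = [Fraction(0)]       # A_0 = 0
    for n in range(len(path)):
        prefix = tuple(path[:n])
        cond_exp = Fraction(Y(prefix + (1,)) + Y(prefix + (-1,)), 2)
        y_now, y_next = Y(prefix), Y(tuple(path[:n + 1]))
        A.append(A[-1] + cond_exp - y_now)    # A_{n+1} = A_n + E[Y_{n+1}|F_n] - Y_n
        M.append(M[-1] + y_next - cond_exp)   # M_{n+1} = M_n + Y_{n+1} - E[Y_{n+1}|F_n]
    return M, A

def square_of_walk(prefix):
    """Y_n = S_n^2, the square of the random walk S_n = sum of the steps."""
    return sum(prefix) ** 2

M, A = doob_decomposition(square_of_walk, (1, 1, -1, 1))
```

For the submartingale $Y_n = S_n^2$ the previsible part computed here is $A_n = n$ (increasing, as one expects for a submartingale), and $M_n + A_n$ reproduces $Y_n$ at every step.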

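Exercise 2.1.10 Suppose that $Y_t$ is a martingale. By Jensen's inequality, $Y_t^2$ will be a
submartingale, and thus have an increasing trend. The previsible trend part $A_t$ of $Y_t^2$ is
called the quadratic variation, for the following reason:

$$\Delta A_t = \mathbb{E}[Y_t^2 - Y_{t-1}^2 | \mathcal{F}_{t-1}] = \mathbb{E}[(Y_t - Y_{t-1})^2 | \mathcal{F}_{t-1}] = \mathbb{E}[(\Delta Y_t)^2 | \mathcal{F}_{t-1}]$$

so that $A_t = \sum_{s=1}^{t} \mathbb{E}[(\Delta Y_s)^2 | \mathcal{F}_{s-1}]$. Prove this.

□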






In the continuous-time theory, the generalization of the Doob decomposition to the
Doob-Meyer Decomposition Theorem for submartingales is a deep result. The quadratic variation
process associated with a submartingale is of great importance in deriving a general theory 
of stochastic integration. 

Definition 2.1.11 If $C$ is a previsible process, and if $X$ is adapted (both with respect
to a filtration $(\mathcal{F}_n)_{n \in \mathbb{N}}$), then the martingale transform of $X$ by $C$ is the process $W$
given by

$$W_0 = 0, \qquad W_n = \sum_{k=1}^{n} C_k (X_k - X_{k-1}) \quad \text{if } n > 0$$

The process $W$ is generally denoted by $C \cdot X$, and $W_n$ by $(C \cdot X)_n$.

Thus the martingale transform of X by C is simply your winnings process on the game 
X using the gambling strategy C. Now comes the crunch: 



Theorem 2.1.12 (a) Suppose that $X$ is a martingale, and that $C$ is a bounded previsible
process. Then $C \cdot X$ is a martingale.

(b) If $X$ is a supermartingale (submartingale), and $C$ is a bounded non-negative previsible
process, then $C \cdot X$ is a supermartingale (submartingale).

Proof: (a) Let $W = C \cdot X$. The fact that $C$ is bounded and that each $X_n$ is integrable
implies that $W_n$ is integrable as well. Then $W_{n+1} - W_n = C_{n+1}(X_{n+1} - X_n)$. Using the fact
that $X_n$ and $C_{n+1}$ are $\mathcal{F}_n$-measurable, we see that

$$\mathbb{E}[W_{n+1} - W_n | \mathcal{F}_n] = C_{n+1} \mathbb{E}[X_{n+1} - X_n | \mathcal{F}_n] = 0$$

so that $\mathbb{E}[W_{n+1} | \mathcal{F}_n] = W_n$.

The proof of (b) is left as an exercise.

∎

This theorem has the following important consequence for games of chance: You cannot 
find a previsible trading strategy which will turn a fair game to your advantage, i.e. which 
will turn a martingale into a submartingale. No matter what your strategy, your winnings 
process will still be a martingale. 

As a final remark, note that if $X$ is a (super-, sub-)martingale, and $a$ is a constant, then
$Y = X + a$ is a (super-, sub-)martingale, and moreover $C \cdot X = C \cdot Y$.
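The statement that no previsible system beats a fair game can be checked exactly on a small example. The following Python sketch enumerates all paths of a fair $\pm 1$ coin-tossing game and computes the expected winnings of a "double after every loss" system; the system, the cap of 8 on the stake (to keep $C$ bounded) and the function names are illustrative assumptions, not taken from the notes. For contrast it also evaluates a stake that peeks at the outcome of the game it is placed on, which is not previsible.

```python
from fractions import Fraction
from itertools import product

def stake(prefix):
    """Hypothetical system: stake 1, doubled after every loss, capped at 8.

    The stake depends only on the steps already seen, so it is previsible.
    """
    s = 1
    for step in prefix:
        s = 1 if step == 1 else min(2 * s, 8)
    return s

def winnings(path):
    """(C . S)_n = sum_k C_k (S_k - S_{k-1}) along one path of +/-1 steps."""
    return sum(stake(path[:k]) * path[k] for k in range(len(path)))

n = 6
paths = list(product((1, -1), repeat=n))

# Expected winnings of the previsible system: an exact average over all 2^n paths.
expected = sum(Fraction(winnings(p), 2 ** n) for p in paths)

# A stake that looks at the current step (bet 1 only on winning games) is not previsible.
expected_peeking = sum(Fraction(sum(1 for s in p if s == 1), 2 ** n) for p in paths)
```

Here `expected` is exactly 0, while the peeking strategy earns $n/2$ on average; the only difference between the two is previsibility.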

2.2 Stopping Times and Optional Stopping 

In many games of chance, and this includes playing the stock market, one has the option to 
quit at any time. You may have a strategy to decide when to stop, e.g. quit if you've lost 
5 times in a row, or quit if you've lost half your initial fortune. In that case, your stopping 
time depends on the state of the world, i.e. it is random. In the discrete framework, we can
therefore regard a stopping time as a random variable $\tau : \Omega \to \mathbb{N}$. If $\tau(\omega) = n$, then you stop
after the $n$th game if the state of the world is $\omega$. Not all random variables $\tau : \Omega \to \mathbb{N}$ are
suitable as stopping strategies, however. For example, let $W_n$ be your winnings after the $n$th
round of a coin-tossing game. Define

$$\tau = n \quad \text{where } W_n = \sup\{W_m : m \le 30\}$$

This is a very desirable stopping strategy. Here's why: The strategy considers a sequence
of 30 games, and stops when the winnings are at a maximum. Thus if $\tau = 23$, then $W_{23}$ is
the largest amount you will win in this state of the world: $W_{24}, W_{25}, \dots, W_{30}$ are all $\le W_{23}$.
Clearly the best thing to do is to stop at game 23. However, the problem is that by the time
you reach game 23, you don't know whether $W_{23}$ is the highest your winnings will ever be.
This information is not available. Therefore not all positive integer-valued random variables
are good stopping times. We define:

Definition 2.2.1 A map $\tau : \Omega \to \{0, 1, 2, \dots, \infty\}$ is called a stopping time if

(a) $\{\tau \le n\} = \{\omega : \tau(\omega) \le n\} \in \mathcal{F}_n$ for all $n < \infty$

or equivalently, if

(b) $\{\tau = n\} \in \mathcal{F}_n$ for all $n < \infty$

Intuitively, $\tau$ is a stopping time if and only if at time $n$ you can decide whether $\tau = n$ or not. Whether
you continue or stop depends only on the history up to, and including, time $n$.

Note that we include the possibility that $\tau = \infty$, i.e. that the game never stops.

Proof that (a) and (b) are equivalent: Suppose that (a) holds, i.e. that $\{\tau \le k\} \in \mathcal{F}_k$
for all $k$. Then

$$\{\tau = n\} = \{\tau \le n\} \setminus \{\tau \le n - 1\} \in \mathcal{F}_n$$

On the other hand, if $\tau$ has property (b), i.e. if $\{\tau = k\} \in \mathcal{F}_k$ for all $k$, then

$$\{\tau \le n\} = \bigcup_{k \le n} \{\tau = k\} \in \mathcal{F}_n$$

∎

Example 2.2.2 Suppose you and an opponent play a coin tossing game, both with initial
fortunes of R10.00. Let $S_n$ denote your fortune after the $n$th game. Then

$$\tau = \min\{n : S_n = 0 \text{ or } S_n = 15\}$$

is clearly a stopping time (with respect to the filtration $\mathcal{F}_n = \sigma(S_0, S_1, \dots, S_n)$). You will
stop playing either when you are ruined (i.e. when $S_n = 0$), or when you've won R5.00 off
your opponent.

Using the mathematical definition, we have

$$\{\tau = n\} = \{0 < S_1 < 15\} \cap \{0 < S_2 < 15\} \cap \dots \cap \{0 < S_{n-1} < 15\} \cap (\{S_n = 15\} \cup \{S_n = 0\})$$

and each of the sets on the right belongs to $\mathcal{F}_n$. Hence so does $\{\tau = n\}$.

In this case, $\tau$ is called a hitting time: It is the first time the process $S_n$ hits either 0 or
15.

□
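On a finite coin-tossing space, condition (b) of Definition 2.2.1 says that whether $\tau = n$ must be determined by the first $n$ tosses alone, and this can be tested by brute force. The Python sketch below does so for a capped hitting time and for the "stop at the maximum" rule discussed before the definition. The horizon `N = 6`, the level 2 and all function names are assumptions made for the illustration.

```python
from itertools import product

N = 6                                        # horizon (an assumption of this sketch)
PATHS = list(product((1, -1), repeat=N))     # all outcomes of N fair coin tosses

def partial_sums(path):
    """Winnings W_0, ..., W_N along one path of +/-1 steps."""
    sums = [0]
    for step in path:
        sums.append(sums[-1] + step)
    return sums

def first_hit(path, level=2):
    """First n with W_n = level, or N if the level is not reached by time N."""
    W = partial_sums(path)
    return next((n for n, w in enumerate(W) if w == level), N)

def time_of_maximum(path):
    """First n at which W_n equals max{W_m : m <= N}; this looks into the future."""
    W = partial_sums(path)
    return W.index(max(W))

def is_stopping_time(tau):
    """Check that the event {tau = n} is decided by the first n steps, for each n <= N."""
    for n in range(N + 1):
        decided = {}
        for p in PATHS:
            answer = tau(p) == n
            if decided.setdefault(p[:n], answer) != answer:
                return False                 # two paths agree up to n but disagree on {tau = n}
    return True
```

With these definitions `is_stopping_time(first_hit)` returns `True`, while `is_stopping_time(time_of_maximum)` returns `False`: already at time 0 one cannot tell whether the winnings will ever rise above their starting value.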

Exercise 2.2.3 Let $X_n$ be a stochastic process adapted to a filtration $\mathcal{F}_n$, and let $B \subseteq \mathbb{R}$ be
a Borel set. Show that the time of first entry into $B$,

$$\tau = \min\{n : X_n \in B\}$$

is a stopping time.

□

Recall the following terminology: If $a$, $b$ are real numbers, then

$$a \wedge b = \min\{a, b\}, \qquad a \vee b = \max\{a, b\}$$

The next exercise is important:

Exercise 2.2.4 (a) Prove that if $S$, $T$ are stopping times, then so are $T \wedge S$, $T \vee S$, $T + S$.

(b) Prove that if $\tau_n$, $n \in \mathbb{N}$, are stopping times, then so are $\sup_n \tau_n$, $\inf_n \tau_n$, $\limsup_n \tau_n$, $\liminf_n \tau_n$
and, where it exists, $\lim_n \tau_n$.

□

Given a stochastic process $X_n$ (e.g. your winnings in a game of chance), and a stopping
time $\tau$, we define the stopped value $X_\tau$ to be the random variable defined by

$$X_\tau(\omega) = X_{\tau(\omega)}(\omega)$$

We also define the stopped process $X^\tau_n$ to be the same as $X_n$ until the stopping time is reached,
and constant with value $X_\tau$ thereafter. To be precise:

Definition 2.2.5 Let $X_n$ be a stochastic process, and let $\tau$ be a stopping time. We define
the stopped process $X^\tau$ by

$$X^\tau_n(\omega) = X_{\tau \wedge n}(\omega) = \begin{cases} X_n(\omega) & \text{if } n \le \tau(\omega) \\ X_{\tau(\omega)}(\omega) & \text{if } n > \tau(\omega) \end{cases}$$

□

It is easy to see that if $X_n$ is adapted to a filtration $\mathcal{F}_n$ then so is $X^\tau_n$.

In the previous section, we showed that you cannot turn a fair game to your advantage 
by choosing an appropriate betting strategy. Our next result shows that you cannot turn a 
fair game to your advantage by choosing an appropriate stopping time: 






Theorem 2.2.6 (Stopped martingales are martingales)
Let $\tau$ be a stopping time.

(a) If $X$ is a martingale, then so is the stopped process $X^\tau$.
(b) If $X$ is a supermartingale (submartingale), then so is $X^\tau$.

Proof: This follows from Theorem 2.1.12. First assume that $X_0 = 0$. We will show that
$X^\tau$ is the martingale transform of $X$ with respect to a suitable strategy. Define a previsible
process $C$ by

$$C_n = \begin{cases} 0 & \text{if } \tau < n \\ 1 & \text{if } \tau \ge n \end{cases}$$

Thus $C_n = I_{\{\tau \ge n\}}$. Now $\{\tau \ge n\} = \{\tau \le n - 1\}^c$, and since $\tau$ is a stopping time, we have
$\{\tau \le n - 1\} \in \mathcal{F}_{n-1}$. Hence $C_n$ is previsible.

Now note that if we take the martingale transform of $X$ by $C$, we obtain

$$(C \cdot X)_n = \sum_{k=1}^{n} C_k (X_k - X_{k-1}) = \begin{cases} X_n & \text{if } \tau \ge n \\ X_\tau & \text{if } \tau < n \end{cases}$$

and thus that $C \cdot X = X^\tau$. The result now follows from Theorem 2.1.12.

We have now proved the result for the special case where $X_0 = 0$. To prove the general
result, apply the special case to the (super-, sub-)martingale $Y_n = X_n - X_0$, and note that $X^\tau_n = Y^\tau_n + X_0$.

This theorem immediately implies that when $X$ is a martingale, we have $\mathbb{E}[X^\tau_n] = \mathbb{E}[X^\tau_0] = \mathbb{E}[X_0]$.

Definition 2.2.7 Suppose that $\tau$ is a stopping time on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$
equipped with a filtration $(\mathcal{F}_t)_{t \in T}$. The $\sigma$-algebra of events prior to $\tau$, denoted $\mathcal{F}_\tau$, is the
set of all events $A \in \mathcal{F}$ with the property that

$$A \cap \{\tau \le t\} \in \mathcal{F}_t \quad \text{for all } t \in T$$

The above "definition" requires (a) of the following exercise. (This entire exercise is very
important.)

Exercise 2.2.8 (a) $\mathcal{F}_\tau$ is a $\sigma$-algebra.

(b) We can replace $\le$ by $=$ in the above definition, i.e.

$$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau = t\} \in \mathcal{F}_t \text{ for all } t \in T\}$$

(c) Let $X_t$ be an adapted process. Show that both $X_\tau$ and $\tau$ are $\mathcal{F}_\tau$-measurable.

(d) Prove that if $\sigma \le \tau$ are stopping times, then $\mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$.

□






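How should we interpret $\mathcal{F}_\tau$? Roughly speaking,

$\mathcal{F}_\tau$ consists of all events that can be decided by time $\tau$.

This is because $A \in \mathcal{F}_\tau$ if and only if $A \cap \{\tau \le t\}$ can be decided by time $t$, so if $\tau(\omega) = t$,
then $A$ is decidable at time $t$ (i.e. at time $\tau$). Note that though $\tau$ is random, $\mathcal{F}_\tau$ is not.

Proposition 2.2.9 (a) $\{\sigma \le \tau\} \in \mathcal{F}_{\sigma \wedge \tau}$ and $\{\sigma = \tau\} \in \mathcal{F}_{\sigma \wedge \tau}$;

(b) $A \in \mathcal{F}_\tau$ implies $A \cap \{\tau \le \sigma\} \in \mathcal{F}_{\sigma \wedge \tau}$ and $A \cap \{\tau = \sigma\} \in \mathcal{F}_{\sigma \wedge \tau}$;

(c) If $\tau_n \uparrow \tau < \infty$, then $\mathcal{F}_{\tau_n} \uparrow \mathcal{F}_\tau$.

Proof: (a) Note that for all $n$, $\{\sigma \le \tau\} \cap \{\sigma \wedge \tau = n\} = \{\sigma \le \tau\} \cap \{\sigma = n\} = \{\tau \ge n\} \cap \{\sigma = n\}$,
and this set belongs to $\mathcal{F}_n$.
Similarly, $\{\tau \le \sigma\} \in \mathcal{F}_{\sigma \wedge \tau}$, and thus $\{\sigma = \tau\} \in \mathcal{F}_{\sigma \wedge \tau}$ as well.

(b) If $A \in \mathcal{F}_\tau$, then for all $n$, $A \cap \{\tau \le \sigma\} \cap \{\sigma \wedge \tau = n\} = A \cap \{\tau = n\} \cap \{\sigma \le n - 1\}^c \in \mathcal{F}_n$.

(c) Let $\mathcal{G} = \sigma(\bigcup_n \mathcal{F}_{\tau_n})$. We must show $\mathcal{G} = \mathcal{F}_\tau$. Now clearly $\mathcal{G} \subseteq \mathcal{F}_\tau$; hence we need only
prove the reverse inclusion.

So let $A \in \mathcal{F}_\tau$. Each $\tau_n$ is $\mathcal{G}$-measurable, and thus $\tau = \lim_n \tau_n$ is $\mathcal{G}$-measurable as well.
Since $\tau_n$, $\tau$ take only integer values, since $\tau_n \uparrow \tau$ and since $\tau < \infty$, we must have $\tau(\omega) = \tau_n(\omega)$
for some $n$ (which may depend on $\omega$). Thus $\Omega = \bigcup_n \{\tau_n = \tau\}$. It follows that

$$A = A \cap \bigcup_n \{\tau_n = \tau\} = \bigcup_n (A \cap \{\tau_n = \tau\})$$

But by (b), since $A \in \mathcal{F}_\tau$ and $\tau \wedge \tau_n = \tau_n$, we see that $A \cap \{\tau_n = \tau\} \in \mathcal{F}_{\tau_n}$. Thus $A \in \mathcal{G}$, i.e.
$\mathcal{F}_\tau \subseteq \mathcal{G}$ as well.

∎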

Note: One would imagine that if $\tau$ is a stopping time and if $X_n$ is a martingale, then $\mathbb{E}[X_\tau] = \mathbb{E}[X_0]$.
That this is not necessarily the case is demonstrated by the following example: Let
$S_n$ be a symmetric random walk on the integers, starting at 0, and let $\tau = \min\{n : S_n = -1\}$,
i.e. $\tau$ is the first time the process $S_n$ hits $-1$ (and $\tau = \infty$ if it never hits $-1$). It is clear that
$S_n$ is a martingale with $\mathbb{E}S_n = S_0 = 0$. It is also clear that $\mathbb{E}S_\tau = -1$, because the process
will stop only when it hits $-1$ (which happens with probability 1).

□
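The example does not contradict Theorem 2.2.6: for every fixed $n$ the stopped walk still has mean zero, and it is only the limiting value $S_\tau$ that has mean $-1$. The Python sketch below makes this visible by exact enumeration of all paths of length $n$; the function names are hypothetical and the computation is meant purely as an illustration.

```python
from fractions import Fraction
from itertools import product

def stopped_value(path, n, level=-1):
    """S_{tau ^ n} along one path, where tau is the first time the walk hits `level`."""
    s = 0
    for step in path[:n]:
        if s == level:        # the walk has already been stopped
            break
        s += step
    return s

def stopped_mean_and_hit_prob(n):
    """Return (E[S_{tau ^ n}], P(tau <= n)), computed exactly over all 2^n paths."""
    paths = list(product((1, -1), repeat=n))
    mean = sum(Fraction(stopped_value(p, n), 2 ** n) for p in paths)
    hit = Fraction(sum(stopped_value(p, n) == -1 for p in paths), 2 ** n)
    return mean, hit

results = [stopped_mean_and_hit_prob(n) for n in range(11)]
```

Every entry of `results` has mean 0, in agreement with $\mathbb{E}[S_{\tau \wedge n}] = \mathbb{E}[S_0]$, while the probability that the walk has already been stopped at $-1$ increases with $n$. The mean stays at zero because the paths that have not yet hit $-1$ sit at increasingly large positive values.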

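Theorem 2.2.10 (Optional Sampling Theorem: Bounded Case)

Let $X$ be a supermartingale, and let $\tau$, $\sigma$ be bounded stopping times with $\sigma \le \tau$ a.s. Then

$$\mathbb{E}[X_\tau | \mathcal{F}_\sigma] \le X_\sigma \quad \text{a.s.}$$

Moreover, if $X$ is a martingale, then equality holds.

Proof: Assume that $\sigma \le \tau \le N$ for some natural number $N$. Note that $|X_\tau| \le |X_0| + \cdots + |X_N|$,
so that $X_\tau$ is integrable. The same is true for $X_\sigma$.

Next note that

$$\sum_{n=0}^{N-1} I_{\{\sigma \le n < \tau\}} (X_{n+1} - X_n) = X_\tau - X_\sigma$$

Now if $A \in \mathcal{F}_\sigma$, then $A \cap \{\sigma \le n\} \in \mathcal{F}_n$, and so the set

$$A_n = A \cap \{\sigma \le n < \tau\} = A \cap \{\sigma \le n\} \cap \{\tau \le n\}^c$$

belongs to $\mathcal{F}_n$. Hence

$$I_A (X_\tau - X_\sigma) = \sum_{n=0}^{N-1} I_{A_n} (X_{n+1} - X_n)$$

Applying the supermartingale property to $A_n \in \mathcal{F}_n$ we see that

$$\mathbb{E}[X_{n+1}; A_n] \le \mathbb{E}[X_n; A_n]$$

which implies that

$$\mathbb{E}[X_\tau; A] \le \mathbb{E}[X_\sigma; A]$$

for any $A \in \mathcal{F}_\sigma$. But then $\mathbb{E}[X_\tau | \mathcal{F}_\sigma] \le X_\sigma$, as required.

∎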



Corollary 2.2.11 Let $X_n$ be a (super)martingale w.r.t. the filtration $\mathcal{F}_n$, and let $\tau_n$ be
an increasing sequence of bounded stopping times. Then $X_{\tau_n}$ is a (super)martingale w.r.t.
the filtration $\mathcal{F}_{\tau_n}$.

Proof: Suppose that $X_n$ is a supermartingale w.r.t. $\mathcal{F}_n$. Let $M_n = X_{\tau_n}$, $\mathcal{G}_n = \mathcal{F}_{\tau_n}$. Then if
$m \le n$, we have $\mathbb{E}[M_n | \mathcal{G}_m] = \mathbb{E}[X_{\tau_n} | \mathcal{F}_{\tau_m}] \le X_{\tau_m} = M_m$, by the Optional Sampling Theorem
(bounded case), since $\tau_m \le \tau_n$ and $\tau_n$ is bounded. Clearly equality holds if $X_n$ is a martingale.

∎

2.3 The Martingale Convergence Theorem 

The main result of this section states that $L^1$-bounded discrete-parameter (super-, sub-)martingales
converge almost surely. As usual, we work in a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ equipped
with a filtration $(\mathcal{F}_n)_n$.

Theorem 2.3.1 (Doob's Martingale Convergence Theorem)

Let $(X_n)_n$ be a discrete-parameter supermartingale bounded in $\mathcal{L}^1$, i.e. $\sup_n \mathbb{E}(|X_n|) < \infty$.
Then there is a random variable $X_\infty \in \mathcal{L}^1$ such that

$$X_n \to X_\infty \quad \text{a.s. as } n \to \infty$$

Note that since $\mathbb{E}|X_\infty| < \infty$, we have $\mathbb{P}(|X_\infty| < \infty) = 1$, i.e. $X_\infty$ is almost surely finite. Note
also that $X_n \to X_\infty$ a.s. does not mean that $X_n$ converges to $X_\infty$ in $\mathcal{L}^1$ as well. In the
next section, we shall show how to extend this theorem to $\mathcal{L}^1$- and $\mathcal{L}^p$-convergence.
We need a couple of new concepts before we can tackle the proof.






Definition 2.3.2 Let $X$ be a discrete-parameter supermartingale, and let $a < b \in \mathbb{R}$.
The number $U_N(X; [a, b])(\omega)$ of upcrossings of $[a, b]$ is the number of times $X$ crosses from
below $a$ to above $b$ by time $N$. To be precise, $U_N(X; [a, b])(\omega)$ is the largest $k \in \mathbb{Z}^+$ for
which there exist intertwined sequences $s_i$, $t_i$ with

$$0 \le s_1 < t_1 < s_2 < t_2 < \cdots < s_k < t_k \le N$$

such that

$$X_{s_i}(\omega) < a, \quad X_{t_i}(\omega) > b \qquad (1 \le i \le k)$$

Also define

$$U_\infty(X; [a, b])(\omega) = \sup_N U_N(X; [a, b])(\omega)$$

We will show that $\lim_n X_n$ exists a.s. in the following manner: Suppose that $\lim_n X_n(\omega)$
does not exist. Then $\liminf_n X_n(\omega) < \limsup_n X_n(\omega)$. Thus there exist rational numbers $a$, $b$
such that $\liminf_n X_n(\omega) < a < b < \limsup_n X_n(\omega)$, and thus $U_\infty(X; [a, b])(\omega) = \infty$. We shall
show that this is possible only for a null set of $\omega$. Thus the set $\{\liminf_n X_n \ne \limsup_n X_n\}$
has measure 0, i.e. $\lim_n X_n$ exists a.s.

We first put a bound on the number of up-crossings:



Lemma 2.3.3 (Doob's Upcrossing Lemma)
Suppose that $X$ is a supermartingale. Then

$$(b - a)\, \mathbb{E} U_N(X; [a, b]) \le \mathbb{E}[(X_N - a)^-]$$

Proof: Regard $X$ as a repeated game of chance, so that if you bet a stake $C_n$ on the $n$th
game, your winnings will be $C_n (X_n - X_{n-1})$ for that game. Choose a betting strategy $C$ as
follows:

Wait until $X$ gets below $a$.
Place unit stakes until $X$ gets above $b$.
Wait (i.e. stop betting) until $X$ gets below $a$ again.
Place unit stakes until $X$ gets above $b$.
etc.

To describe $C$ mathematically, note that if $C_n = 0$ (i.e. no bets on the $n$th game), then

$$C_{n+1} = \begin{cases} 0 & \text{if } X_n \ge a \\ 1 & \text{if } X_n < a \end{cases}$$

Similarly, if $C_n = 1$, then

$$C_{n+1} = \begin{cases} 0 & \text{if } X_n > b \\ 1 & \text{if } X_n \le b \end{cases}$$

It follows that we can define $C_n$ inductively by

$$C_1 = I_{\{X_0 < a\}}$$
$$C_{n+1} = I_{\{C_n = 0\}} I_{\{X_n < a\}} + I_{\{C_n = 1\}} I_{\{X_n \le b\}}$$

Since $C_{n+1}$ is defined in terms of the $\mathcal{F}_n$-measurable functions $C_n$, $X_n$, it follows that $C_n$ is
previsible.

Let $Y_n$ be the total winnings until time $n$, i.e. $Y_n = (C \cdot X)_n$.
Now note that the total winnings by time $N$ must satisfy

$$Y_N(\omega) \ge (b - a) U_N(X; [a, b])(\omega) - [X_N(\omega) - a]^-$$

This is because every upcrossing contributes at least $(b - a)$ to the total winnings. The final
term, $[X_N - a]^-$, takes into account that we may be placing bets on the last stretch to time
$N$. It is clear that our losses on this stretch cannot exceed $[X_N - a]^-$.

Since $X$ is a supermartingale and $C$ is previsible, bounded and non-negative, the martingale
transform $C \cdot X$ is also a supermartingale. Thus

$$\mathbb{E} Y_N = \mathbb{E}(C \cdot X)_N \le \mathbb{E}(C \cdot X)_0 = 0$$

from which the desired inequality follows immediately.

∎
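The pathwise inequality at the heart of this proof is easy to test on concrete paths. The Python sketch below counts upcrossings and accumulates the winnings of the strategy $C$ defined by the recursion above; the function names are hypothetical, and the paths fed to it (for instance all $\pm 1$ walks of a given length) are an assumption made for illustration only.

```python
def upcrossings(X, a, b):
    """Number of completed upcrossings of [a, b] by the finite path X_0, ..., X_N."""
    count, below = 0, False
    for x in X:
        if not below and x < a:
            below = True          # the path has got below a: an upcrossing may start
        elif below and x > b:
            below = False         # the path has got above b: the upcrossing is complete
            count += 1
    return count

def strategy_winnings(X, a, b):
    """Y_N = (C . X)_N with C_1 = I{X_0 < a} and C_{n+1} given by the recursion."""
    c = 1 if X[0] < a else 0      # C_1
    total = 0
    for n in range(1, len(X)):
        total += c * (X[n] - X[n - 1])        # winnings on the n-th game at stake C_n
        if c == 0:
            c = 1 if X[n] < a else 0          # start betting once X is below a
        else:
            c = 1 if X[n] <= b else 0         # keep betting until X is above b
    return total

def negative_part(x):
    return max(-x, 0)

def inequality_holds(X, a, b):
    """Pathwise bound Y_N >= (b - a) U_N - (X_N - a)^-."""
    return strategy_winnings(X, a, b) >= (b - a) * upcrossings(X, a, b) - negative_part(X[-1] - a)
```

For example, along the path `[0, -2, 3, -2, 3, 0]` with `a = -1`, `b = 1` the strategy bets on the two rises from $-2$ to $3$, so there are two upcrossings and the total winnings are 10, comfortably above the bound $(b - a) \cdot 2 = 4$.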

Corollary 2.3.4 Let $X$ be a supermartingale bounded in $\mathcal{L}^1$, and let $a < b \in \mathbb{R}$. Then

$$(b - a)\, \mathbb{E} U_\infty(X; [a, b]) \le |a| + \sup_n \mathbb{E}|X_n| < \infty$$

so that

$$\mathbb{P}(U_\infty(X; [a, b]) < \infty) = 1$$

Proof: Since the $U_N$ are non-negative and increasing, the Monotone Convergence Theorem
implies that $(b - a) \mathbb{E} U_N \uparrow (b - a) \mathbb{E} U_\infty$ as $N \to \infty$. Now, using the triangle inequality,

$$(b - a) \lim_N \mathbb{E} U_N \le \sup_N \mathbb{E}[X_N - a]^- \le \sup_N \mathbb{E}|X_N| + |a|$$

from which the required inequality follows. The second result is a trivial consequence of the
first.

∎

We can now prove the Martingale Convergence Theorem: 

Proof of Martingale Convergence Theorem: We want to prove that $X_n(\omega)$ converges
almost surely to some finite limit $X_\infty(\omega)$. Now recall that $\liminf_n X_n$ and $\limsup_n X_n$ always
exist, but that $\lim_n X_n$ only exists if $\liminf_n X_n = \limsup_n X_n$. For concreteness' sake, define
$X_\infty(\omega) = \limsup_n X_n(\omega)$. Let

$$\Lambda = \{\omega : X_n(\omega) \text{ does not converge to a limit in } [-\infty, \infty]\}$$
$$= \{\omega : \liminf_n X_n(\omega) < \limsup_n X_n(\omega)\}$$
$$= \bigcup_{\{a < b \in \mathbb{Q}\}} \{\omega : \liminf_n X_n(\omega) < a < b < \limsup_n X_n(\omega)\}$$
$$= \bigcup_{\{a < b \in \mathbb{Q}\}} \Lambda_{a,b}$$

where the last equality defines $\Lambda_{a,b}$ in the obvious way. Now if

$$\liminf_n X_n(\omega) < a < b < \limsup_n X_n(\omega)$$

then $X_n(\omega)$ must go below $a$ infinitely many times, and also go above $b$ infinitely many times.
It follows that $\Lambda_{a,b} \subseteq \{\omega : U_\infty(X; [a, b])(\omega) = \infty\}$. The preceding corollary assures us that
$\mathbb{P}(U_\infty(X; [a, b]) = \infty) = 0$, however, and thus that $\mathbb{P}(\Lambda_{a,b}) = 0$ for every $a < b \in \mathbb{Q}$. Since $\Lambda$
is just the countable union of the $\Lambda_{a,b}$, and since the countable union of sets of measure zero
itself has measure zero, it follows that $\mathbb{P}(\Lambda) = 0$, and thus that $X_\infty = \lim_n X_n$ exists almost
surely, though the limit may be $\pm\infty$. We must still prove, therefore, that $X_\infty \in \mathcal{L}^1$.
By Fatou's Lemma, we see that

$$\mathbb{E}(|X_\infty|) = \mathbb{E}(\liminf_n |X_n|) \le \liminf_n \mathbb{E}|X_n| \le \sup_n \mathbb{E}|X_n| < \infty$$

where the last inequality follows because we are assuming that the sequence $X_n$ is bounded
in $\mathcal{L}^1$. Hence $X_\infty \in \mathcal{L}^1$, so that $\mathbb{P}(X_\infty \text{ is finite}) = 1$, as required.

∎

Example 2.3.5 Suppose that a gambler, starting with an initial fortune $X_0 \in \mathbb{N}$, plays
repeated rounds of a fair game. The gambler will play until he is ruined (if ever). Also
assume that, while the gambler is playing, 1 unit is won or lost on each game.

Let $X_n$ be the gambler's fortune after $n$ games, and let $\tau = \inf\{n : X_n = 0\}$ be the time
of ruin. Then

$$|X_{n+1} - X_n| = 1 \quad \text{if } n < \tau$$

and

$$|X_{n+1} - X_n| = 0 \quad \text{if } n \ge \tau$$

Now $X_n \ge 0$ for each $n$, and $\mathbb{E}X_n = X_0$, because $(X_n)_n$ is a martingale. It follows that $(X_n)_n$
is $\mathcal{L}^1$-bounded, and thus that there is $X_\infty \in \mathcal{L}^1$ such that $X_n \to X_\infty$ a.s. Thus, almost surely,
$(X_n)_n$ is a Cauchy sequence.

Now let $0 < \varepsilon < 1$. Then, almost surely, there is $N \in \mathbb{N}$ such that $n \ge N$ implies
$|X_{n+1} - X_n| < \varepsilon$. But then clearly $|X_{n+1} - X_n| = 0$, so that $\tau \le N$. Hence, almost surely, $\tau$
is finite, i.e. the gambler will eventually be ruined, with probability 1.

□
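The conclusion of this example can be watched numerically. The Python sketch below propagates the exact distribution of the gambler's fortune for a fair game with stakes of 1 unit and absorption at 0, and returns both the probability of ruin by time $n$ and the mean fortune. The function name and the dictionary representation of the distribution are choices made for this illustration only.

```python
from fractions import Fraction

def ruin_probability_by(n, start):
    """Return (P(tau <= n), E[X_n]) for a fair +/-1 game started at `start`, absorbed at 0."""
    dist = {start: Fraction(1)}               # exact distribution of the fortune X_k
    for _ in range(n):
        new = {}
        for x, p in dist.items():
            if x == 0:                        # ruined: the fortune stays at 0
                new[0] = new.get(0, 0) + p
            else:                             # otherwise win or lose 1 unit with probability 1/2
                for y in (x - 1, x + 1):
                    new[y] = new.get(y, 0) + p / 2
        dist = new
    mean = sum(x * p for x, p in dist.items())
    return dist.get(0, Fraction(0)), mean
```

The mean fortune returned equals the starting fortune for every $n$, as it must for a martingale, while the probability of ruin creeps up towards 1 as $n$ grows. Since the almost sure limit is $X_\infty = 0$ but $\mathbb{E}X_n = X_0$ for all $n$, this is also an instance of almost sure convergence without $\mathcal{L}^1$-convergence.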

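The Martingale Convergence Theorem states that any $\mathcal{L}^1$-bounded martingale converges
almost surely. However, we are frequently interested in other types of convergence, e.g.
$\mathcal{L}^1$-convergence. The next section deals with this.

We end this section with one more useful convergence result:

Proposition 2.3.6 Suppose that $X_n \to X_\infty$ in $\mathcal{L}^p(\Omega, \mathcal{F}, \mathbb{P})$ for some $p \in [1, \infty)$. Let $\mathcal{G}$ be
a sub-$\sigma$-algebra of $\mathcal{F}$. Then $\mathbb{E}[X_n | \mathcal{G}] \to \mathbb{E}[X_\infty | \mathcal{G}]$ in $\mathcal{L}^p$ as well.

Proof: By Jensen's inequality,

$$\mathbb{E}\,\big|\mathbb{E}[X_\infty | \mathcal{G}] - \mathbb{E}[X_n | \mathcal{G}]\big|^p \le \mathbb{E}|X_\infty - X_n|^p \to 0$$

∎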






2.4 Uniformly Integrable Martingales 

2.4.1 Uniform Integrability 

Work in a probability space $(\Omega, \mathcal{F}, \mathbb{P})$.

Definition 2.4.1 (1) A set $\{X_i : i \in I\}$ of random variables is said to be $L^1$-bounded if
$\sup_{i \in I} \mathbb{E}|X_i| < \infty$, i.e. if there is a $K < \infty$ such that $\mathbb{E}|X_i| \le K$ for all $i \in I$.

(2) A set $\{X_i : i \in I\}$ of random variables is said to be uniformly $\mathbb{P}$-continuous (u-P-c for
short) if and only if for every $\varepsilon > 0$ there is $\delta > 0$ such that whenever $\mathbb{P}(F) < \delta$, then

$$\mathbb{E}[|X_i|; F] < \varepsilon \quad \text{for all } i \in I$$

i.e. if and only if

$$\sup_{i \in I} \mathbb{E}[|X_i|; F] \to 0 \quad \text{as } \mathbb{P}(F) \to 0$$

(3) A set $\{X_i : i \in I\}$ of random variables is said to be uniformly integrable (or UI) if

$$\lim_{K \to \infty} \sup_{i \in I} \mathbb{E}[|X_i|; |X_i| > K] = 0$$

i.e. if for every $\varepsilon > 0$ there is a $K$ such that

$$\mathbb{E}[|X_i|; |X_i| > K] < \varepsilon \quad \text{for all } i \in I$$

We say that a discrete- or continuous-parameter stochastic process $X_t$ is UI if and only
if the collection of its component random variables $\{X_t\}_t$ is UI.

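Exercise 2.4.2 Suppose that $X \in \mathcal{L}^1(\Omega, \mathcal{F}, \mathbb{P})$. We show that the singleton $\{X\}$ is both UI and u-P-c.

(a) We first show that $\{X\}$ is u-P-c. Suppose not. Explain why there is an $\varepsilon > 0$ and a sequence $F_n \in \mathcal{F}$ such
that

$$\mathbb{P}(F_n) < 2^{-n} \quad \text{but} \quad \mathbb{E}[|X|; F_n] \ge \varepsilon$$

(b) Now define $F := \{F_n \text{ i.o.}\}$. Apply a Fatou Lemma to conclude that

$$\mathbb{E}[|X|; F] \ge \limsup_n \mathbb{E}[|X|; F_n] \ge \varepsilon$$

(c) Apply a Borel-Cantelli Lemma to show

$$\mathbb{P}(F) = \mathbb{P}(F_n \text{ i.o.}) = 0$$

and explain why we have obtained a contradiction. This proves that $\{X\}$ is u-P-c.

(d) Next, we show that $\{X\}$ is UI. Let $\varepsilon > 0$, and choose $\delta > 0$ such that $\mathbb{E}[|X|; F] < \varepsilon$ whenever $\mathbb{P}(F) < \delta$.
Why can we do this?

(e) Show that

$$\mathbb{P}(|X| > K) \le \frac{1}{K} \mathbb{E}|X| < \infty$$

(f) Take $K > \mathbb{E}|X| / \delta$ and show $\mathbb{E}[|X|; |X| > K] < \varepsilon$, as required.

□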

Exercises 2.4.3 (a) $\{X_i : i \in I\}$ is UI if and only if $\{|X_i| : i \in I\}$ is UI.

(b) Every UI family is bounded in $L^1$.

(c) Not every $\mathcal{L}^1$-bounded family is UI: Let $(\Omega, \mathcal{F}, \mathbb{P})$ be the usual probability space on the unit interval $[0, 1]$,
together with Lebesgue measure. Let

$$F_n = [0, 1/n], \qquad X_n = n I_{F_n}$$

for $n \in \mathbb{N}$. Show that $\{X_n : n \in \mathbb{N}\}$ is bounded in $L^1$, but that $\{X_n : n \in \mathbb{N}\}$ is not UI.

(d) Suppose that $\{X_i : i \in I\}$ is a family of random variables dominated by some $Y \in \mathcal{L}^1$ (i.e. $|X_i| \le Y$ a.s.
for all $i \in I$). Show that $\{X_i : i \in I\}$ is UI.

(e) Show that any finite family of integrable random variables is UI.

(f) Show that if $\mathcal{X}$, $\mathcal{Y}$ are two UI families, then the families $\mathcal{X} \cup \mathcal{Y}$ and $\mathcal{X} + \mathcal{Y} = \{X + Y : X \in \mathcal{X}, Y \in \mathcal{Y}\}$ are
also UI.

(g) Show that if $\mathcal{X}$ is a family of identically distributed integrable random variables, then $\mathcal{X}$ is UI.

□
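For the family in part (c), both quantities in Definition 2.4.1 can be written down in closed form, which makes the failure of uniform integrability easy to see. The short Python sketch below evaluates them with exact fractions; the function names are hypothetical, and the supremum is taken only over $n \le$ `n_max`, which is an assumption of the sketch.

```python
from fractions import Fraction

def mean_abs(n):
    """E|X_n| for X_n = n * 1_{[0, 1/n]} under Lebesgue measure on [0, 1]."""
    return n * Fraction(1, n)

def tail_mean(n, K):
    """E[|X_n|; |X_n| > K]: X_n equals n on a set of measure 1/n and 0 elsewhere."""
    return n * Fraction(1, n) if n > K else Fraction(0)

def sup_tail(K, n_max):
    """sup over n <= n_max of E[|X_n|; |X_n| > K]."""
    return max(tail_mean(n, K) for n in range(1, n_max + 1))
```

Every `mean_abs(n)` equals 1, so the family is $L^1$-bounded, but `sup_tail(K, n_max)` equals 1 as soon as `n_max > K`, however large $K$ is chosen. The tail expectations therefore do not tend to 0 uniformly in $n$.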



Theorem 2.4.4 $\{X_i : i \in I\}$ is UI iff it is $L^1$-bounded and u-P-c.

Proof: ($\Leftarrow$): Suppose $\{X_i : i \in I\}$ is $L^1$-bounded and u-P-c. Choose $M < \infty$ such that
$\mathbb{E}|X_i| \le M$ for all $i \in I$ (by $L^1$-boundedness). For $\varepsilon > 0$, choose a $\delta > 0$ as in the definition
of uniform-$\mathbb{P}$-continuity. Observe that $K \mathbb{P}(|X_i| > K) \le \mathbb{E}[|X_i|; |X_i| > K] \le M$, so that if
$K > M/\delta$, then $\mathbb{P}(|X_i| > K) < \delta$, from which $\mathbb{E}[|X_i|; |X_i| > K] < \varepsilon$. Since the definition of
$K$ does not depend on $i \in I$, we see that $\{X_i : i \in I\}$ is UI.

($\Rightarrow$): Now suppose that $\{X_i : i \in I\}$ is UI. For $\varepsilon > 0$, choose $K$ such that $\mathbb{E}[|X_i|; |X_i| > K] < \varepsilon/2$
for all $i \in I$. Observe first that

$$\mathbb{E}|X_i| = \mathbb{E}[|X_i|; |X_i| \le K] + \mathbb{E}[|X_i|; |X_i| > K] \le K + \frac{\varepsilon}{2} \quad \text{for all } i \in I$$

which proves that $\{X_i : i \in I\}$ is $L^1$-bounded. Now if $F$ is a measurable set with $\mathbb{P}(F) < \frac{\varepsilon}{2K}$,
we have

$$\mathbb{E}[|X_i|; F] = \mathbb{E}[|X_i|; F \cap \{|X_i| \le K\}] + \mathbb{E}[|X_i|; F \cap \{|X_i| > K\}] \le K \mathbb{P}(F) + \varepsilon/2 < \varepsilon$$

so that $\{X_i : i \in I\}$ is also u-P-c.

Our main result is the following: $\mathcal{L}^1$-convergence is precisely the intersection of uniform
integrability with convergence in probability.



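Theorem 2.4.5 If $X_n, X \in \mathcal{L}^1$, then $X_n \to X$ in $\mathcal{L}^1$ if and only if $\{X_n : n \in \mathbb{N}\}$ is UI and
$X_n \to X$ in probability.

Proof: ($\Rightarrow$): We know that if $X_n \to X$ in $\mathcal{L}^1$, then $\{X_n : n \in \mathbb{N}\}$ is $L^1$-bounded and $X_n \to X$ in probability.
It therefore remains to show that $\{X_n : n \in \mathbb{N}\}$ is u-P-c. So let $\varepsilon > 0$. Choose $N \in \mathbb{N}$ so
that $n \ge N$ implies $\|X_n - X\|_1 = \mathbb{E}|X_n - X| < \varepsilon/2$. Choose $\delta > 0$ so that $\mathbb{P}(F) < \delta$ implies
$\mathbb{E}[|X|; F] < \varepsilon/2$ (cf. Exercise 2.4.2), and decrease $\delta$ if necessary so that also $\mathbb{P}(F) < \delta$ implies
$\mathbb{E}[|X_n|; F] < \varepsilon/2$ for $n = 1, \dots, N - 1$. Then if $n \ge N$ we also have

$$\mathbb{E}[|X_n|; F] \le \mathbb{E}[|X_n - X|; F] + \mathbb{E}[|X|; F] < \varepsilon/2 + \varepsilon/2$$

which shows that $\{X_n : n \in \mathbb{N}\}$ is u-P-c.

($\Leftarrow$): Recall that we proved the following fact about convergence in probability: Suppose
that $f : \mathbb{R}^+ \to \mathbb{R}^+$ is a function which is increasing, and strictly increasing on some interval
$(0, a)$, is bounded, continuous, and satisfies $f(0) = 0$. Then $X_n \to X$ in probability iff $\mathbb{E} f(|X_n - X|) \to 0$.

Suppose now that $X_n$ is UI, and that $X_n \to X$ in probability. Fix $\varepsilon > 0$. Note that $\{X_n - X : n \in \mathbb{N}\}$ is
UI also (cf. Exercises 2.4.2 and 2.4.3(f)). First, pick $K$ such that $\mathbb{E}[|X_n - X|; |X_n - X| > K] < \varepsilon/2$
for all $n \in \mathbb{N}$. Now set $f(x) := x \wedge K$, and note that $f$ has the properties required to determine
convergence in probability. Hence $\mathbb{E}[|X_n - X| \wedge K] \to 0$. Choose $N \in \mathbb{N}$ sufficiently large that
$\mathbb{E}[|X_n - X| \wedge K] < \varepsilon/2$ whenever $n \ge N$. Then for $n \ge N$, we have

$$\|X_n - X\|_1 = \mathbb{E}[|X_n - X|] = \mathbb{E}[|X_n - X|; |X_n - X| \le K] + \mathbb{E}[|X_n - X|; |X_n - X| > K]$$
$$\le \mathbb{E}[|X_n - X| \wedge K] + \mathbb{E}[|X_n - X|; |X_n - X| > K] < \frac{\varepsilon}{2} + \frac{\varepsilon}{2}$$

so that $\|X_n - X\|_1 \to 0$, as required.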



We continue with a few more results about uniform integrability that will come in handy:
We have already seen that every UI family is bounded in $L^1$, but that not every $L^1$-bounded
family is UI (by Exercise 2.4.3). Nevertheless, being UI is only just stronger than
being bounded in $L^1$, as the following proposition makes clear:

Proposition 2.4.6 If $p > 1$ and $\{X_i : i \in I\}$ is a family of random variables which is
bounded in $L^p$, then $\{X_i\}_i$ is UI.

Exercise 2.4.7 We prove the preceding Proposition:

(a) Explain why there is $B \in \mathbb{R}$ such that $\mathbb{E}|X_i|^p \le B$ for all $i \in I$.

(b) Let $K > 0$. Show that if $x \ge K$, then $x = x^{1-p} x^p \le K^{1-p} x^p$.

(c) Deduce that

$$\mathbb{E}[|X_i|; |X_i| > K] \le K^{1-p} \mathbb{E}[|X_i|^p; |X_i| > K] \le K^{1-p} B$$

for all $i \in I$.

(d) Now given $\varepsilon > 0$, choose $K$ sufficiently large so that $K^{1-p} B < \varepsilon$. Show that we will have
$\mathbb{E}[|X_i|; |X_i| > K] < \varepsilon$ for all $i \in I$, so that $\{X_i : i \in I\}$ is UI, as claimed.

□

Here is an important source of UI martingales: 



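Theorem 2.4.8 Suppose that $X \in \mathcal{L}^1(\Omega, \mathcal{F}, \mathbb{P})$, and that $\{\mathcal{F}_i : i \in I\}$ is a family of
sub-$\sigma$-algebras of $\mathcal{F}$. Then the set

$$\{\mathbb{E}[X | \mathcal{F}_i] : i \in I\}$$

is UI.

The proof is an exercise: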






Exercise 2.4.9 Prove the preceding theorem.

[Hint: Let $\varepsilon > 0$, and choose $\delta > 0$ such that $\mathbb{P}(F) < \delta \Rightarrow \mathbb{E}(|X|; F) < \varepsilon$. (Why does $\delta$ exist?) Choose $K$
such that $K^{-1} \mathbb{E}|X| < \delta$. If $i \in I$, let $X_i := \mathbb{E}[X | \mathcal{F}_i]$, and use Jensen's inequality combined with Markov's
inequality to show that

$$K \mathbb{P}(|X_i| > K) \le \mathbb{E}|X|$$

Deduce from the definition of conditional expectation that $\mathbb{E}[|X_i|; |X_i| > K] < \varepsilon$.]

□

Remarks 2.4.10 The importance of the notion of uniform integrability becomes clear when we consider
topology: Uniform integrability is equivalent to relative sequential compactness in $L^1$ equipped with its weak
(i.e. $\sigma(L^1, L^\infty)$-) topology. We will not need this fact, so do not prove it here.

□



2.4.2 UI Martingales 



Theorem 2.4.11 (a) Suppose that $X$ is a supermartingale. Then $X$ is UI if and only if
there is a random variable $X_\infty$ such that $X_n \to X_\infty$ a.s. and in $\mathcal{L}^1$. We then have

$$X_n \ge \mathbb{E}[X_\infty | \mathcal{F}_n].$$

(b) Moreover, if $X$ is a UI martingale, then

$$X_n = \mathbb{E}[X_\infty | \mathcal{F}_n] \quad \text{a.s.}$$

Proof: (a) Suppose that $(X_n)_n$ is a UI supermartingale. Then $X_n$ is bounded in $\mathcal{L}^1$, and
thus there is a random variable $X_\infty$ such that $X_n \to X_\infty$ a.s., by the Martingale Convergence
Theorem. But then $X_n \to X_\infty$ in probability. It follows that $X_n \to X_\infty$ in $\mathcal{L}^1$, by Theorem
2.4.5.

Conversely, if $X_n \to X_\infty$ in $\mathcal{L}^1$, then $(X_n)_n$ is UI, again by Theorem 2.4.5.
We now show that $\mathbb{E}[X_\infty | \mathcal{F}_n] \le X_n$ a.s. Suppose that $F \in \mathcal{F}_n$; then

$$\mathbb{E}[X_n; F] \ge \mathbb{E}[X_m; F] \quad \text{for all } m \ge n$$

(This is just the supermartingale property.) But

$$|\mathbb{E}[X_m; F] - \mathbb{E}[X_\infty; F]| \le \mathbb{E}[|X_m - X_\infty|; F] \to 0 \quad \text{as } m \to \infty$$

because $X_m \to X_\infty$ in $\mathcal{L}^1$. Thus, letting $m \to \infty$, we get

$$\mathbb{E}[X_m; F] \to \mathbb{E}[X_\infty; F]$$

But for $m \ge n$ we have $\mathbb{E}[X_m; F] \le \mathbb{E}[X_n; F]$, and thus $\mathbb{E}[X_\infty; F] \le \mathbb{E}[X_n; F]$, as required.

(b) follows from the observation that if $X$ is a martingale, then we can replace the
inequality signs by $=$ in the above.



Remarks 2.4.12 We have seen in Theorem 2.4.8 that applying conditional expectations
to an integrable random variable produces uniform integrability. By Theorems 2.4.8 and
2.4.11 it follows straight away that all UI martingales are obtained by applying conditional
expectations:

A martingale $M$ is UI if and only if there is a random variable $M_\infty \in \mathcal{L}^1$ such
that $M_n \to M_\infty$ a.s. and in $\mathcal{L}^1$, and

$$M_n = \mathbb{E}[M_\infty | \mathcal{F}_n]$$

□



2.4.3 Optional Sampling of UI Supermartingales 

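Let $M_n$ be a UI martingale with respect to some filtration $\mathcal{F}_n$, and let $\mathcal{F}_\infty = \sigma(\bigcup_n \mathcal{F}_n)$.

Theorem 2.4.13 (Doob's Optional Sampling Theorem) Let $0 \le \sigma \le \tau \le \infty$ be stopping
times, and suppose that $M$ is a UI martingale. Then

$$\mathbb{E}[M_\tau | \mathcal{F}_\sigma] = M_\sigma \quad \text{a.s.}$$

Proof: We know that there is $M_\infty$ such that $M_n \to M_\infty$ a.s. and in $\mathcal{L}^1$, and that
$M_n = \mathbb{E}[M_\infty | \mathcal{F}_n]$ for all $n$. To prove the theorem, it suffices to show that $\mathbb{E}[M_\infty | \mathcal{F}_\tau] = M_\tau$, for then

$$\mathbb{E}[M_\tau | \mathcal{F}_\sigma] = \mathbb{E}[\mathbb{E}[M_\infty | \mathcal{F}_\tau] | \mathcal{F}_\sigma] = \mathbb{E}[M_\infty | \mathcal{F}_\sigma] = M_\sigma$$

Now if $F \in \mathcal{F}_\tau$, then $F = \bigcup_n F \cap \{\tau = n\}$ (with $n$ ranging over $n \le \infty$), and each $F \cap \{\tau = n\} \in \mathcal{F}_n$. Hence

$$\mathbb{E}[M_\infty; F] = \sum_n \mathbb{E}[M_\infty; F \cap \{\tau = n\}]$$
$$= \sum_n \mathbb{E}[M_n; F \cap \{\tau = n\}]$$
$$= \sum_n \mathbb{E}[M_\tau; F \cap \{\tau = n\}]$$
$$= \mathbb{E}[M_\tau; F]$$

∎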

We can now give another characterization of UI martingales. Let $M$ be an adapted process.
We define $M_\infty$ by

$$M_\infty(\omega) = \begin{cases} \lim_{n \to \infty} M_n(\omega) & \text{if this limit exists} \\ 0 & \text{otherwise} \end{cases}$$

Note that if $M_n \to X$ a.s., then $X = M_\infty$ a.s.

Theorem 2.4.14 Suppose that $M$ is an adapted process. Then the following are equivalent:

(a) $M$ is a UI martingale;

(b) There is $c \in \mathbb{R}$ such that for every stopping time $\tau \le \infty$, we have

$$\mathbb{E}[|M_\tau|] < \infty \quad \text{and} \quad \mathbb{E}[M_\tau] = c$$






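Proof: (a) $\Rightarrow$ (b): If $M$ is a UI martingale, then we know that $M_n \to M_\infty$ a.s. and in $\mathcal{L}^1$.
By applying the Optional Sampling Theorem to the stopping times $0 \le \tau$, we have
$\mathbb{E}[M_\tau] = \mathbb{E}M_0 = c$ for all stopping times $\tau$. Moreover, if we apply the Optional Sampling
Theorem to the stopping times $\tau \le \infty$, we see that $M_\tau = \mathbb{E}[M_\infty | \mathcal{F}_\tau]$. Now $M_\infty \in \mathcal{L}^1$ and
$\mathbb{E}[|M_\tau|] \le \mathbb{E}[|M_\infty|] < \infty$, by Jensen's inequality. Thus $M_\tau \in \mathcal{L}^1$ for every stopping time $\tau$.

(b) $\Rightarrow$ (a): Note that every constant time $\tau = n$ is a stopping time, and thus we have $M_n \in \mathcal{L}^1$
with $\mathbb{E}[M_n] = c$ for all $n \le \infty$. (Here $M_\infty$ is defined in the way described just before the
statement of the theorem.) This suggests a martingale property (but is still a long way from
proving it). Now let $F \in \mathcal{F}_n$, and define the random time $\tau$ by

$$\tau(\omega) = \begin{cases} n & \text{if } \omega \in F \\ \infty & \text{if } \omega \in F^c \end{cases}$$

It is clear that $\tau$ is a stopping time, and thus

$$c = \mathbb{E}[M_\tau] = \mathbb{E}[M_n; F] + \mathbb{E}[M_\infty; F^c]$$
$$c = \mathbb{E}[M_\infty] = \mathbb{E}[M_\infty; F] + \mathbb{E}[M_\infty; F^c]$$

which clearly implies that $\mathbb{E}[M_n; F] = \mathbb{E}[M_\infty; F]$. It follows that

$$M_n = \mathbb{E}[M_\infty | \mathcal{F}_n]$$

and thus that $M$ is a UI martingale, by Theorem 2.4.8.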



2.5 Upwards and Downwards 

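Theorem 2.5.1 (Lévy's Upward Theorem)

Suppose that $\zeta \in \mathcal{L}^1(\Omega, \mathcal{F}, \mathbb{P})$, and that $\mathcal{F}_n$ is a filtration, $\mathcal{F}_n \subseteq \mathcal{F}$. Define $\mathcal{F}_\infty = \sigma(\bigcup_n \mathcal{F}_n)$,
and define $M_n = \mathbb{E}[\zeta | \mathcal{F}_n]$ for $n \le \infty$. Then $M$ is a UI martingale and $M_n \to M_\infty$ a.s.
and in $\mathcal{L}^1$.

Proof: That $M$ is a martingale follows trivially from the Tower Property, and that $M$ is UI
follows from Theorem 2.4.8. Hence there is $\eta$ such that $M_n \to \eta$ a.s. and in $\mathcal{L}^1$, so we must
just show that $\eta = M_\infty$ a.s. But if $F \in \mathcal{F}_n$, then

$$\mathbb{E}[\zeta; F] = \mathbb{E}[M_m; F] \quad \text{for all } m \ge n$$

by definition of conditional expectation. Since $M_m \to \eta$ in $\mathcal{L}^1$ we must have $\mathbb{E}[\eta; F] = \mathbb{E}[\zeta; F]$
for all $F \in \mathcal{F}_n$. Since this is true for all $n$, we have

$$\mathbb{E}[\eta; F] = \mathbb{E}[\zeta; F] \quad \text{for all } F \in \bigcup_n \mathcal{F}_n$$

But $\bigcup_n \mathcal{F}_n$ is a $\pi$-system that generates $\mathcal{F}_\infty$, and thus $\eta = \mathbb{E}[\zeta | \mathcal{F}_\infty] = M_\infty$ a.s.

□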
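A standard concrete instance of the Upward Theorem, offered here only as an illustrative sketch, takes $\Omega = [0, 1)$ with Lebesgue measure, lets $\mathcal{F}_n$ be generated by the dyadic intervals $[k/2^n, (k+1)/2^n)$, and takes $\zeta(x) = x^2$. Then $M_n = \mathbb{E}[\zeta | \mathcal{F}_n]$ is the average of $\zeta$ over the dyadic interval containing $x$, which equals $(3k^2 + 3k + 1)/(3 \cdot 4^n)$ on the $k$th interval. The choice of $\zeta$ and all function names below are assumptions of this example.

```python
from fractions import Fraction

def M(n, k):
    """E[zeta | F_n] on the dyadic interval [k/2^n, (k+1)/2^n), for zeta(x) = x^2."""
    return Fraction(3 * k * k + 3 * k + 1, 3 * 4 ** n)

def is_martingale_step(n):
    """Tower property: M_n is the average of M_{n+1} over the two child intervals."""
    return all(M(n, k) == (M(n + 1, 2 * k) + M(n + 1, 2 * k + 1)) / 2
               for k in range(2 ** n))

def sup_error(n):
    """sup |M_n - zeta| over [0, 1); zeta is increasing, so interval endpoints suffice."""
    return max(max(M(n, k) - Fraction(k, 2 ** n) ** 2,
                   Fraction(k + 1, 2 ** n) ** 2 - M(n, k))
               for k in range(2 ** n))
```

The check `is_martingale_step(n)` confirms the martingale property level by level, and `sup_error(n)` stays below $2^{-n}$, so in this example $M_n \to \zeta$ uniformly, and in particular almost surely and in $\mathcal{L}^1$. Here $\zeta$ is itself $\mathcal{F}_\infty$-measurable, so that $M_\infty = \zeta$.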






There is also a downwards version of the preceding theorem, obtained by going back- 
wards in time, which will play an important part in the continuous-parameter theory. This 
necessitates the introduction of reversed martingales. 

Note that we can define the notion of martingale w.r.t. any partially ordered index set
$(P, \le)$ as follows: A $P$-filtration is a set of $P$-indexed $\sigma$-algebras satisfying

\[ \mathcal{F}_p \subseteq \mathcal{F}_q \quad \text{whenever } p \le q \text{ in } P \]

and an adapted $P$-indexed family of integrable random variables $X_p$ is called a $P$-supermartingale
if and only if

\[ \mathbb{E}[X_q|\mathcal{F}_p] \le X_p \quad \text{whenever } p \le q \text{ in } P \]

Note that this definition makes sense even if $P$ is not a total ordering.

Now consider the set $\mathbb{N}$ of non-negative integers together with the reverse ordering $\preceq$
defined by

\[ n \preceq m \quad \text{iff} \quad m \le n \]

An $(\mathbb{N}, \preceq)$-supermartingale is called a reversed supermartingale. Thus if $X_n$ is a reversed
supermartingale with respect to a filtration $\mathcal{F}_n$, then each $X_n$ is integrable and $\mathcal{F}_n$-measurable,
and

\[ \mathcal{F}_1 \supseteq \mathcal{F}_2 \supseteq \mathcal{F}_3 \supseteq \cdots \quad \text{and} \quad \mathbb{E}[X_m|\mathcal{F}_n] \le X_n \text{ if } m \le n \]

In particular, $\mathbb{E}[X_n|\mathcal{F}_{n+1}] \le X_{n+1}$ and $\mathbb{E}[X_0|\mathcal{F}_n] \le X_n$.



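Theorem 2.5.2 (Lévy-Doob Downward Theorem)

Let $(\mathcal{F}_n : n \in \mathbb{N})$ be a decreasing sequence of $\sigma$-algebras, i.e.

\[ \mathcal{F}_n \supseteq \mathcal{F}_{n+1} \quad \text{for all } n \in \mathbb{N}, \]

and define $\mathcal{F}_\infty = \bigcap_{n \in \mathbb{N}} \mathcal{F}_n$. Let $X = (X_n : n \in \mathbb{N})$ be a reversed supermartingale w.r.t.
$(\mathcal{F}_n)_n$, so that

\[ \mathbb{E}[X_n|\mathcal{F}_m] \le X_m \quad \text{for } n \le m \]

Finally, assume that the family $X_n$ has $\lim_n \mathbb{E}X_n < \infty$.
Then $X_n$ is UI, and the limit

\[ X_\infty = \lim_{n \to \infty} X_n \]

exists a.s. and in $\mathcal{L}^1$. Moreover, we have

\[ \mathbb{E}[X_n|\mathcal{F}_\infty] \le X_\infty \quad \text{a.s.} \]

with equality if $X_n$ is a reversed martingale.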



Proof: The existence of the a.s. limit $X_\infty = \lim X_n$ follows from the Upcrossing Lemma,
just as for the ordinary Supermartingale Convergence Theorem: Note that if $X_0, \dots, X_N$ is a
reversed supermartingale, then the reversed sequence $Y_n = X_{N-n}$ is an ordinary supermartingale.
By the Upcrossing Lemma, we therefore have $(b-a)\,\mathbb{E}U_N(Y; [a,b]) \le \mathbb{E}[(Y_N - a)^-]$. Clearly $Y$
and $X$ have the same number of upcrossings, however, and thus $(b-a)\,\mathbb{E}U_N(X; [a,b]) \le \mathbb{E}[(X_0 - a)^-]$.
From here, it is straightforward to prove the a.s. convergence of $X_n$.






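Once we've proved that $X$ is UI, it will follow from Theorem 2.4.5 that $X_n$ converges in
$\mathcal{L}^1$ as well. A quick perusal of the proof of Theorem 2.4.11 ought to convince you that in that
case we also have $\mathbb{E}[X_m|\mathcal{F}_\infty] \le X_\infty$ a.s., with equality if $X$ is a martingale.

Our task is therefore to prove that $X$ is UI. Let $\varepsilon > 0$, and note that $(\mathbb{E}X_n)_n$ is an increasing
bounded sequence (by hypothesis). Since a monotone bounded sequence is convergent, it
is Cauchy, and thus we may choose $N \in \mathbb{N}$ such that $n \ge N$ implies $\mathbb{E}X_n - \mathbb{E}X_N < \frac{\varepsilon}{2}$. Then
for $K > 0$ and $n \ge N$, we have

\[
\begin{aligned}
\mathbb{E}[|X_n|; |X_n| > K] &= -\mathbb{E}[X_n; X_n < -K] + \mathbb{E}[X_n] - \mathbb{E}[X_n; X_n \le K] \\
&\le -\mathbb{E}[X_N; X_n < -K] + \mathbb{E}[X_N] + \tfrac{\varepsilon}{2} - \mathbb{E}[X_N; X_n \le K] \\
&= -\mathbb{E}[X_N; X_n < -K] + \mathbb{E}[X_N; X_n > K] + \tfrac{\varepsilon}{2} \\
&\le \mathbb{E}[|X_N|; |X_n| > K] + \tfrac{\varepsilon}{2}
\end{aligned}
\]

Here the first inequality uses the reversed supermartingale property in the form
$\mathbb{E}[X_N; A] \le \mathbb{E}[X_n; A]$ for $A \in \mathcal{F}_n$.

But $\mathbb{P}(|X_n| > K) \to 0$ as $K \to \infty$, uniformly in $n$: For let $L = \uparrow\lim_n \mathbb{E}X_n < \infty$. By Jensen's inequality, $X_n^-$ is
a reversed submartingale (because $x^- = \max\{-x, 0\}$ is clearly convex and decreasing). Thus
if $K > 0$, we have

\[ K\,\mathbb{P}(|X_n| > K) \le \mathbb{E}|X_n| = \mathbb{E}X_n + 2\mathbb{E}X_n^- \le L + 2\mathbb{E}X_0^- \]

It follows that $\sup_n \mathbb{P}(|X_n| > K) \to 0$ as $K \to \infty$.

Since the single integrable random variable $X_N$ satisfies $\mathbb{E}[|X_N|; A] \to 0$ as $\mathbb{P}(A) \to 0$,
picking a sufficiently large $K$ ensures that $\mathbb{E}[|X_n|; |X_n| > K] < \varepsilon$ for all
$n \ge N$. If necessary, we can enlarge $K$ even more to ensure that $\mathbb{E}[|X_n|; |X_n| > K] < \varepsilon$ for
all $n < N$ as well.

This proves that $X$ is UI, as required.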
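A standard example of a reversed martingale is the sequence of sample means of i.i.d. integrable random variables. The sketch below checks this exactly in a small case; it is an illustration under assumptions not made in the text: the variables are $N$ fair coin flips, $S_n$ is the number of heads among the first $n$, and $\mathcal{G}_n = \sigma(S_n, S_{n+1}, \dots, S_N)$. The claim verified is $\mathbb{E}[S_n/n \,|\, \mathcal{G}_{n+1}] = S_{n+1}/(n+1)$. The function name `check_reversed_martingale` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def check_reversed_martingale(N):
    """Return True if E[S_n/n | S_{n+1},...,S_N] == S_{n+1}/(n+1) for all 1 <= n < N."""
    outcomes = list(product([0, 1], repeat=N))      # equally likely coin-flip sequences
    for n in range(1, N):
        groups = defaultdict(list)
        for x in outcomes:
            sums = [sum(x[:m]) for m in range(1, N + 1)]    # sums[m-1] = S_m
            future = tuple(sums[n:])                        # (S_{n+1}, ..., S_N)
            groups[future].append(Fraction(sums[n - 1], n)) # sample mean S_n / n
        for future, values in groups.items():
            cond_mean = sum(values) / len(values)           # conditional expectation on this atom
            if cond_mean != Fraction(future[0], n + 1):
                return False
    return True
```

Conditioning on the future partial sums is done by grouping outcomes into the atoms of $\mathcal{G}_{n+1}$; since all outcomes are equally likely, the conditional expectation on an atom is a plain average.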



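It may be useful to put the upwards and downwards theorems together. Let $\mathcal{F}_n \uparrow \mathcal{G}$ mean
"$\mathcal{F}_n$ is increasing and $\mathcal{G} = \sigma(\bigcup_n \mathcal{F}_n)$". Similarly, $\mathcal{F}_n \downarrow \mathcal{G}$ abbreviates "$\mathcal{F}_n$ is decreasing, and
$\mathcal{G} = \bigcap_n \mathcal{F}_n$".

Theorem 2.5.3 Suppose that $\mathcal{F}_n$ is a sequence of $\sigma$-algebras and that $\zeta$ is an integrable
random variable. Define $X_n = \mathbb{E}[\zeta|\mathcal{F}_n]$.

(a) If $\mathcal{F}_n \uparrow \mathcal{G}$, then $X_n \to \mathbb{E}[\zeta|\mathcal{G}]$ a.s. and in $\mathcal{L}^1$. Moreover, $X_n$ is a UI martingale.

(b) If $\mathcal{F}_n \downarrow \mathcal{G}$, then $X_n \to \mathbb{E}[\zeta|\mathcal{G}]$ a.s. and in $\mathcal{L}^1$. Moreover, $X_n$ is a UI reversed martingale.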



With the Levy-Doob results, it is now possible to prove a stronger version of the Lebesgue 
Dominated Convergence Theorem. First note the following useful fact: 



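Proposition 2.5.4 (a) If $\mathcal{F}_n \uparrow \mathcal{G}$ and $X_n \to X$ in $\mathcal{L}^1$, then $\mathbb{E}[X_n|\mathcal{F}_n] \to \mathbb{E}[X|\mathcal{G}]$ in $\mathcal{L}^1$.
(b) If $\mathcal{F}_n \downarrow \mathcal{G}$ and $X_n \to X$ in $\mathcal{L}^1$, then $\mathbb{E}[X_n|\mathcal{F}_n] \to \mathbb{E}[X|\mathcal{G}]$ in $\mathcal{L}^1$.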






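Proof: To prove (a), we must show that $\|\mathbb{E}[X_n|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{G}]\|_1 = \mathbb{E}\,|\mathbb{E}[X_n|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{G}]| \to 0$
as $n \to \infty$. Now by the triangle inequality,

\[
\begin{aligned}
\|\mathbb{E}[X_n|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{G}]\|_1 &\le \|\mathbb{E}[X_n|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{F}_n]\|_1 + \|\mathbb{E}[X|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{G}]\|_1 \\
&\le \|X_n - X\|_1 + \|\mathbb{E}[X|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{G}]\|_1
\end{aligned}
\]

But $\|X_n - X\|_1 \to 0$ by hypothesis, and $\|\mathbb{E}[X|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{G}]\|_1 \to 0$ by Lévy's Upward
Theorem.

(b) is proved in the same way, this time invoking the Downward Theorem.

□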

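Theorem 2.5.5 (Strong Dominated Convergence Theorem)

Let $\mathcal{F}_n, \mathcal{G}$ be $\sigma$-algebras, and let $X_n, X, Z$ be integrable random variables. Assume further
that $X_n \to X$ a.s., and that each $|X_n| \le Z$ a.s. Then

(a) If $\mathcal{F}_n \uparrow \mathcal{G}$, then $\mathbb{E}[X_n|\mathcal{F}_n] \to \mathbb{E}[X|\mathcal{G}]$ a.s. and in $\mathcal{L}^1$.

(b) If $\mathcal{F}_n \downarrow \mathcal{G}$, then $\mathbb{E}[X_n|\mathcal{F}_n] \to \mathbb{E}[X|\mathcal{G}]$ a.s. and in $\mathcal{L}^1$.

Proof: By the Lebesgue Dominated Convergence Theorem, we have $X_n \to X$ in $\mathcal{L}^1$, and
thus the $\mathcal{L}^1$-convergence of conditional expectations follows from the previous proposition.
We need therefore only show that a.s. convergence holds.

We only prove (a), the proof of (b) being very similar.
Define $W_n = \sup_{k \ge n} |X_k - X|$. Then $W_n \in \mathcal{L}^1$, because $W_n \le 2Z$. Since $X_n \to X$ a.s., we also
must have $W_n \downarrow 0$ a.s. Now fix $N \ge 1$. If $n \ge N$, then $|X_n - X| \le W_n \le W_N$, which implies
$\mathbb{E}[|X_n - X| \,|\, \mathcal{F}_n] \le \mathbb{E}[W_N|\mathcal{F}_n]$. By the Upward Theorem, we must have $\mathbb{E}[W_N|\mathcal{F}_n] \to \mathbb{E}[W_N|\mathcal{G}]$
as $n \to \infty$, a.s. and in $\mathcal{L}^1$. It follows that

\[ \limsup_n \mathbb{E}[|X_n - X| \,|\, \mathcal{F}_n] \le \lim_n \mathbb{E}[W_N|\mathcal{F}_n] = \mathbb{E}[W_N|\mathcal{G}] \]

Now let $N \to \infty$. By the Lebesgue Dominated Convergence Theorem for conditional
expectations, since $W_N \le 2Z$ and $W_N \downarrow 0$ a.s., we also have $\mathbb{E}[W_N|\mathcal{G}] \downarrow 0$ a.s. It follows that
$\limsup_n \mathbb{E}[|X_n - X| \,|\, \mathcal{F}_n] = 0$ a.s., and thus

\[ |\mathbb{E}[X_n|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{F}_n]| \le \mathbb{E}[|X_n - X| \,|\, \mathcal{F}_n] \to 0 \quad \text{a.s. as } n \to \infty \]

By the Upward Theorem once more, we see that $\mathbb{E}[X|\mathcal{F}_n] \to \mathbb{E}[X|\mathcal{G}]$ a.s. as $n \to \infty$, and so

\[ \mathbb{E}[X_n|\mathcal{F}_n] = \big(\mathbb{E}[X_n|\mathcal{F}_n] - \mathbb{E}[X|\mathcal{F}_n]\big) + \mathbb{E}[X|\mathcal{F}_n] \to \mathbb{E}[X|\mathcal{G}] \quad \text{a.s.} \]

as $n \to \infty$.

□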

2.6 Martingale Inequalities 

The aim of this section is to state and prove two inequalities due to Doob. 






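Theorem 2.6.1 (Doob's Maximal Inequality)

Let $X$ be a non-negative submartingale. Then for $c > 0$ in $\mathbb{R}$ and $n \in \mathbb{N}$ we have

\[ c\,\mathbb{P}\Big(\sup_{k \le n} X_k \ge c\Big) \le \mathbb{E}\Big[X_n; \sup_{k \le n} X_k \ge c\Big] \le \mathbb{E}[X_n] \]

Proof: Let $F = \{\sup_{k \le n} X_k \ge c\}$. Further inductively define a sequence $F_k$, $k \le n$, by

\[ F_0 = \{X_0 \ge c\} \]
\[ F_{k+1} = \{X_0 < c\} \cap \{X_1 < c\} \cap \dots \cap \{X_k < c\} \cap \{X_{k+1} \ge c\} \]

Then $F$ is the disjoint union $F = F_0 \cup \dots \cup F_n$. Because $X$ is adapted, we have $F_k \in \mathcal{F}_k$, and
$X_k \ge c$ on $F_k$. Because $X$ is a submartingale, we therefore have

\[ c\,\mathbb{P}(F_k) \le \mathbb{E}[X_k; F_k] \le \mathbb{E}[X_n; F_k] \]

Summing over $k$, we obtain

\[ c\,\mathbb{P}(F) \le \mathbb{E}[X_n; F] \]

as required.

□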

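Note that if $M$ is a martingale with each $M_n \in \mathcal{L}^2$, then $M^2$ is a non-negative submartingale.
This follows easily from an application of Jensen's inequality. We thus have:

\[ \mathbb{P}\Big(\sup_{k \le n} |M_k| \ge c\Big) \le \frac{1}{c^2}\,\mathbb{E}[M_n^2] \]

Before we prove the next theorem, recall Hölder's inequality: Suppose that $1 < p < \infty$,
and that $q$ is such that $\frac{1}{p} + \frac{1}{q} = 1$. If $X \in \mathcal{L}^p$ and $Y \in \mathcal{L}^q$, then $XY \in \mathcal{L}^1$, and

\[ \|XY\|_1 \le \|X\|_p \|Y\|_q \]

We need a lemma:

Lemma 2.6.2 Suppose that $X, Y$ are non-negative random variables such that

\[ c\,\mathbb{P}(X \ge c) \le \mathbb{E}[Y; X \ge c] \quad \text{for every } c > 0 \]

If $p > 1$ and if $\frac{1}{p} + \frac{1}{q} = 1$, then

\[ \|X\|_p \le q\,\|Y\|_p \]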



Proof: Define

\[ I_1 = \int_0^\infty p c^{p-1}\, \mathbb{P}(X \ge c)\, dc \]
\[ I_2 = \int_0^\infty p c^{p-2}\, \mathbb{E}[Y; X \ge c]\, dc \]






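Then clearly $I_1 \le I_2$. Using Tonelli's Theorem, we change the order of integration:

\[ I_1 = \int_\Omega \left( \int_{c=0}^{X(\omega)} p c^{p-1}\, dc \right) \mathbb{P}(d\omega) = \int_\Omega X^p \, d\mathbb{P} = \mathbb{E}[X^p] \]

Similarly, using $\frac{p}{p-1} = q$,

\[ I_2 = \int_\Omega \left( \int_{c=0}^{X(\omega)} p c^{p-2}\, dc \right) Y(\omega)\, \mathbb{P}(d\omega) = \mathbb{E}[q X^{p-1} Y] \]

Now use Hölder's inequality to conclude that

\[ \mathbb{E}[X^p] \le \mathbb{E}[q X^{p-1} Y] \le q\,\|X^{p-1}\|_q \|Y\|_p \]

So far, we have not imposed any integrability conditions on $X, Y$. The lemma is obviously
true if $\|Y\|_p = \infty$. Suppose now that $\|Y\|_p < \infty$, and for the moment also that $\|X\|_p < \infty$.
Then since $(p-1)q = p$, we have

\[ \|X^{p-1}\|_q = \mathbb{E}[X^p]^{1/q} \]

and thus, dividing through by $\mathbb{E}[X^p]^{1/q}$,

\[ \|X\|_p \le q\,\|Y\|_p \]

If $\|Y\|_p < \infty$, but $\|X\|_p = \infty$, replace $X$ by $X \wedge n$ in the above. Note that the hypothesis of
the lemma is still true, i.e.

\[ c\,\mathbb{P}(X \ge c) \le \mathbb{E}[Y; X \ge c] \;\Longrightarrow\; c\,\mathbb{P}(X \wedge n \ge c) \le \mathbb{E}[Y; X \wedge n \ge c] \]

and certainly $\|X \wedge n\|_p < \infty$. The result follows upon application of the Monotone
Convergence Theorem as $n \to \infty$.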
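The Tonelli step in the proof rests on the identity $\mathbb{E}[X^p] = \int_0^\infty p c^{p-1}\, \mathbb{P}(X \ge c)\, dc$. The following sketch checks that identity exactly for a random variable with finitely many non-negative values. The distribution `dist` and both function names are assumptions made for the illustration, not objects from the text.

```python
from fractions import Fraction

def moment_direct(dist, p):
    """E[X^p] for a finite distribution given as {value: probability}."""
    return sum(prob * value**p for value, prob in dist.items())

def moment_layer_cake(dist, p):
    """Integral of p c^(p-1) P(X >= c) dc over (0, infinity), computed exactly.

    P(X >= c) is constant on each interval (x_{i-1}, x_i] between consecutive
    atoms, and the integral of p c^(p-1) over that interval is x_i^p - x_{i-1}^p.
    """
    total = Fraction(0)
    prev = Fraction(0)
    for x in sorted(dist):
        tail = sum(prob for value, prob in dist.items() if value >= x)  # P(X >= x)
        total += (Fraction(x)**p - prev**p) * tail
        prev = Fraction(x)
    return total

# An assumed example distribution on the values 1, 2, 5.
dist = {1: Fraction(1, 2), 2: Fraction(1, 3), 5: Fraction(1, 6)}
```

With `p = 3` both functions return the same rational number for this `dist`, which is the discrete counterpart of $I_1 = \mathbb{E}[X^p]$.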



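Theorem 2.6.3 (Doob's $\mathcal{L}^p$-Inequality)

Suppose that $p > 1$ and that $q$ is defined so that $\frac{1}{p} + \frac{1}{q} = 1$. Let $X$ be a non-negative
submartingale bounded in $\mathcal{L}^p$, and define

\[ X^* = \sup_{k \in \mathbb{N}} X_k \]

Then $X^* \in \mathcal{L}^p$, and

\[ \|X^*\|_p \le q \sup_{k \in \mathbb{N}} \|X_k\|_p \]

The submartingale $X$ is therefore dominated by $X^* \in \mathcal{L}^p$.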



Footnote (Tonelli's Theorem): i.e. Fubini's Theorem with non-negative integrands.






Proof: Define $X_n^* = \sup_{k \le n} X_k$. Then $X_n^* \uparrow X^*$ a.s. By Doob's maximal inequality, we have
$c\,\mathbb{P}(X_n^* \ge c) \le \mathbb{E}[X_n; X_n^* \ge c]$ for every $c > 0$. By the preceding lemma, we therefore have
$\|X_n^*\|_p \le q\,\|X_n\|_p \le q \sup_{k \in \mathbb{N}} \|X_k\|_p$ for all $n$. Now apply the Monotone Convergence Theorem
as $n \to \infty$ to obtain $\|X^*\|_p \le q \sup_{k \in \mathbb{N}} \|X_k\|_p$.

□

We shall usually apply the $\mathcal{L}^p$-inequality for the case $p = 2$.
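Both of Doob's inequalities can be checked exactly on a small example. The sketch below makes an assumption that is not in the text: $M$ is the simple symmetric random walk started at $0$, so that $|M|$ is a non-negative submartingale. It enumerates all $2^n$ paths and returns the quantities appearing in the maximal inequality (for $|M|$ at level $c$) and in the $\mathcal{L}^2$-inequality. The function name `doob_check` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def doob_check(n, c):
    """Exact quantities for the simple symmetric random walk M_0 = 0, M_k = sum of k signs."""
    paths = list(product([-1, 1], repeat=n))
    weight = Fraction(1, len(paths))      # each path is equally likely
    prob_max_ge_c = Fraction(0)           # P(max_k |M_k| >= c)
    restricted_mean = Fraction(0)         # E[|M_n|; max_k |M_k| >= c]
    mean_max_sq = Fraction(0)             # E[max_k M_k^2]
    mean_end_sq = Fraction(0)             # E[M_n^2]
    for steps in paths:
        m, running_max = 0, 0
        for s in steps:
            m += s
            running_max = max(running_max, abs(m))
        if running_max >= c:
            prob_max_ge_c += weight
            restricted_mean += weight * abs(m)
        mean_max_sq += weight * running_max**2
        mean_end_sq += weight * m * m
    return prob_max_ge_c, restricted_mean, mean_max_sq, mean_end_sq
```

For instance, with `p, r, mx, end = doob_check(10, 3)` the maximal inequality reads `3 * p <= r` and the $\mathcal{L}^2$-inequality (with $q = 2$, squared) reads `mx <= 4 * end`.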



2.7 Continuous-Parameter Martingales
2.7.1 Stopping Times

Many of the concepts and results for discrete-parameter martingales can be extended to
continuous-parameter martingales, and we shall spend some time doing so. From now on, we
shall assume that all martingales are at least cadlag, and that all filtrations satisfy the usual
conditions.

First we define the notion of a stopping time in an obvious way:

Definition 2.7.1 A random variable $\tau : \Omega \to \mathbb{R}^+ \cup \{\infty\}$ is called a stopping time (w.r.t.
a filtration $\mathcal{F}_t$, $t \ge 0$) if and only if

\[ \{\omega : \tau(\omega) \le t\} \in \mathcal{F}_t \quad \text{for each } t \ge 0 \]

If $X$ is a stochastic process, then we define the stopped variable $X_\tau$ by $X_\tau(\omega) = X_{\tau(\omega)}(\omega)$.
The $\sigma$-algebra of events prior to $\tau$, denoted $\mathcal{F}_\tau$ and also called the pre-$\tau$ $\sigma$-algebra, consists
of all those events $A$ with the property that

\[ A \cap \{\tau \le t\} \in \mathcal{F}_t \quad \text{for all } t \ge 0 \]

In the discrete framework, we saw that we could replace $\{\tau \le t\} \in \mathcal{F}_t$ by $\{\tau = t\} \in \mathcal{F}_t$
in the definition of a stopping time. However, this does not work in the continuous case: If
$\tau$ is a continuous random variable, then we clearly have $\mathbb{P}(\tau = t) = 0$, so $\{\tau = t\} \in \mathcal{F}_0 \subseteq \mathcal{F}_t$
always (by the usual conditions, $\mathcal{F}_0$ contains all null sets), whether or not $\tau$ is a stopping time.

The result of the following exercise is often useful: 

Exercise 2.7.2 One frequently also encounters the notion of an optional time. If $\mathcal{F}_t$ is a
filtration which does not necessarily satisfy the usual conditions and if $\tau$ is a random variable,
we say that $\tau$ is an optional time if and only if $\{\tau < t\} \in \mathcal{F}_t$ for all $t \ge 0$.

(i) Show that any stopping time is an optional time.

(ii) Show that if $\mathcal{F}_t$ satisfies the usual conditions, then any optional time is also a stopping
time.

[Hint: (i) $\{\tau < t\} = \bigcup_n \{\tau \le t - \frac{1}{n}\}$; (ii) $\{\tau \le t\} = \bigcap_n \{\tau < t + \frac{1}{n}\} \in \mathcal{F}_s$ whenever $s > t$.]



□ 






The next theorem lumps together all the basic results about stopping times: 

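Theorem 2.7.3 Let $\sigma, \tau, \tau_n$ be stopping times with respect to a filtration $\mathcal{F}_t$ satisfying the
usual conditions.

(a) $\sup_n \tau_n$, $\inf_n \tau_n$, $\limsup_n \tau_n$, $\liminf_n \tau_n$, $\lim_n \tau_n$ are all stopping times.

(b) $\tau \wedge \sigma$, $\tau \vee \sigma$ and $\tau + \sigma$ are stopping times.

(c) The pre-$\tau$ $\sigma$-algebra $\mathcal{F}_\tau$ is a $\sigma$-algebra which contains all the null sets, and $\tau$ is
$\mathcal{F}_\tau$-measurable.

(d) If $\tau(\omega) = t$ for all $\omega$, then $\mathcal{F}_\tau = \mathcal{F}_t$.

(e) If $\sigma \le \tau$, then $\mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$.

(f) A random variable $X$ is $\mathcal{F}_\tau$-measurable if and only if $X I_{\{\tau \le t\}}$ is $\mathcal{F}_t$-measurable for
each $t \ge 0$.

(g) If $A \in \mathcal{F}_\sigma$, then $A \cap \{\sigma \le \tau\}$, $A \cap \{\sigma < \tau\} \in \mathcal{F}_{\sigma \wedge \tau}$.

(h) $\mathcal{F}_{\tau \wedge \sigma} = \mathcal{F}_\tau \cap \mathcal{F}_\sigma$.

(i) Each of the events $\{\tau < \sigma\}$, $\{\sigma < \tau\}$, $\{\tau \le \sigma\}$, $\{\sigma \le \tau\}$, $\{\tau = \sigma\}$ belongs to $\mathcal{F}_\tau \cap \mathcal{F}_\sigma$.

(j) If $\tau_n \downarrow \tau$, then $\mathcal{F}_{\tau_n} \downarrow \mathcal{F}_\tau$.

(k) If $\tilde\tau : \Omega \to [0, \infty]$ is a map such that $\tilde\tau = \tau$ a.s., then $\tilde\tau$ is a stopping time, and $\mathcal{F}_{\tilde\tau} = \mathcal{F}_\tau$.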

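Exercise 2.7.4 (1.) Prove the preceding Theorem.
[Hints:

(a) $\{\sup_n \tau_n \le t\} = \bigcap_n \{\tau_n \le t\}$; $\{\inf_n \tau_n < t\} = \bigcup_n \{\tau_n < t\}$, and use Exercise 2.1.11.

(b) $\tau \wedge \sigma = \inf\{\tau, \sigma\}$; $\{\tau + \sigma < t\} = \bigcup_{q, r \in \mathbb{Q}^+,\, q + r < t} \{\tau < q, \sigma < r\}$.

(c) $A^c \cap \{\tau \le t\} = (A \cap \{\tau \le t\})^c \cap \{\tau \le t\}$; If $N \in \mathcal{F}$ is a null set, then $N \cap \{\tau \le t\}$ is
a null set, and thus in $\mathcal{F}_t$; $\{\tau \le s\} \cap \{\tau \le t\} = \{\tau \le s \wedge t\} \in \mathcal{F}_t$ for all $s, t$.

(d) For each $s$, $A \cap \{\tau \le s\}$ is either $\emptyset$ or $A$, depending on whether $s < t$ or $s \ge t$.

(e) If $\sigma \le \tau$, then $A \cap \{\tau \le t\} = A \cap \{\sigma \le t\} \cap \{\tau \le t\}$.

(f) $\Rightarrow$: Follow the usual procedure from indicator functions to simple to non-negative
measurable functions etc.; $\Leftarrow$: Assume that $X I_{\{\tau \le t\}}$ is $\mathcal{F}_t$-measurable for all $t$. Note
that $\{X \le r\} \cap \{\tau \le t\} = \{X I_{\{\tau \le t\}} \le r\} \cap \{\tau \le t\} \in \mathcal{F}_t$.

(g) Note that if $\tau$ is a stopping time, then $\tau \wedge t$ is $\mathcal{F}_t$-measurable, because $\{\tau \wedge t > s\} = \{\tau > s\} \cap \{t > s\}$.
Next note that $[A \cap \{\sigma \le \tau\}] \cap \{\tau \le t\} = [A \cap \{\sigma \le t\}] \cap \{\tau \le t\} \cap \{\sigma \wedge t \le \tau \wedge t\}$,
and use the just proven fact that both $\sigma \wedge t$, $\tau \wedge t$ are
$\mathcal{F}_t$-measurable. The result with $<$ instead of $\le$ is proven similarly.

(h) If $A \in \mathcal{F}_\sigma \cap \mathcal{F}_\tau$, then note that

\[ A \cap \{\sigma \wedge \tau \le t\} = (A \cap \{\sigma \le \tau\} \cap \{\sigma \le t\}) \cup (A \cap \{\tau \le \sigma\} \cap \{\tau \le t\}) \]

and use (g).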






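(i) By (g) and (h).

(j) Note that $A \cap \{\tau < t\} = \bigcup_n (A \cap \{\tau_n < t\})$.

(k) $\tilde\tau$ is measurable, since $\tilde\tau^{-1}(B)$ differs from $\tau^{-1}(B)$ by a null set. ]

(2.) Prove that if $\tau_n \uparrow \tau$ then we do not necessarily have $\mathcal{F}_\tau = \sigma(\bigcup_n \mathcal{F}_{\tau_n})$.

[Hint: Let $X_t = 0$ for $0 \le t < 1$, let $X_1$ be a Bernoulli variable with $\mathbb{P}(X_1 = 1) = \frac{1}{2} = \mathbb{P}(X_1 = -1)$,
and let $X_t = X_1$ for $t > 1$. Let $\mathcal{F}_t$ be the natural augmented filtration,
and let $\tau_n = 1 - \frac{1}{n}$. ]

(3.) Show that if $\tau_n \uparrow \tau$ and $\bigcup_n \{\tau_n = \tau\} = \Omega$, then $\mathcal{F}_{\tau_n} \uparrow \mathcal{F}_\tau$.

[Hint: If $A \in \mathcal{F}_\tau$, then $A \cap \{\tau \le \tau_n\} = A \cap \{\tau = \tau_n\} \in \mathcal{F}_{\tau_n}$ by (g). Thus
$A = \bigcup_n (A \cap \{\tau_n = \tau\}) \in \sigma(\bigcup_n \mathcal{F}_{\tau_n})$. ]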

□ 

The next proposition will be very useful in transferring results from discrete-time to
continuous-time. It states that any stopping time can be approximated (from above) by
stopping times that take only countably many values. Recall that $[x]$ denotes the greatest
integer less than or equal to $x$, i.e. $[x] = \sup(\mathbb{Z} \cap (-\infty, x])$. Also define $[\infty] = \infty$.

Proposition 2.7.5 (Discretization Lemma) Let $\tau$ be a stopping time. For each integer
$n \ge 1$, define

\[ \tau_n(\omega) = \frac{[2^n \tau(\omega)] + 1}{2^n} \]

Then each $\tau_n$ is a stopping time, with $\tau_n \downarrow \tau$ (pointwise).

Proof: Note that $\tau_n(\omega) = \frac{k+1}{2^n}$ whenever $k/2^n \le \tau(\omega) < (k+1)/2^n$, so that $\tau_n(\omega) > \tau(\omega)$ for
each $\omega \in \Omega$ with $\tau(\omega) < \infty$. So define $D_n^+ = \{k/2^n : k = 0, 1, 2, \dots\}$ (the set of non-negative dyadic rationals
of order $\le n$), and for each $n$, define maps $a_n, b_n : \mathbb{R}^+ \to D_n^+$ by

\[ a_n(t) = \max\{d \in D_n^+ : d \le t\} \qquad b_n(t) = \min\{d \in D_n^+ : d > t\} \]

so that $a_n(t) \le t < b_n(t)$ for each $n$. Since $D_n^+ \subseteq D_{n+1}^+$, we see that $a_n(t) \uparrow t$ and $b_n(t) \downarrow t$
(as $n \to \infty$). Moreover, $\tau_n(\omega) = b_n(\tau(\omega))$, by definition, and each $D_n^+$ is countable. Hence
the range of $\tau_n$ is countable. Now note that $\tau_n(\omega) \le t$ if and only if $\tau(\omega) < a_n(t)$, so that

\[ \{\tau_n \le t\} = \{\tau < a_n(t)\} \in \mathcal{F}_{a_n(t)} \subseteq \mathcal{F}_t \]

This proves that each $\tau_n$ is a stopping time.

□
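The dyadic approximation from above used in the Discretization Lemma is easy to compute for a single value of the stopping time. The sketch below is an illustration only; the function name `discretize` and the sample value $5/7$ are assumptions. It implements $\tau \mapsto ([2^n \tau] + 1)/2^n$ with the convention $[\infty] = \infty$.

```python
import math
from fractions import Fraction

def discretize(tau, n):
    """Return ([2^n tau] + 1) / 2^n, the smallest dyadic of order n strictly above tau."""
    if tau == math.inf:
        return math.inf               # convention [inf] = inf
    return Fraction(math.floor(tau * 2**n) + 1, 2**n)

tau = Fraction(5, 7)                  # an assumed value tau(omega) of the stopping time
approximations = [discretize(tau, n) for n in range(1, 11)]
```

The list `approximations` is non-increasing, every entry is strictly larger than `tau`, and the $n$-th entry is within $2^{-n}$ of `tau`, matching $a_n(t) \le t < b_n(t)$ in the proof. Note that the approximation is strictly above $\tau$ even when $\tau$ is itself dyadic.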

Exercise 2.7.6 If we try to approximate $\tau$ by stopping times $\tau_n \uparrow \tau$ from below by putting
$\tau_n(\omega) = a_n(\tau(\omega))$ (where $a_n$ is defined as in the proof of the Discretization Lemma), then we
run into trouble. Why?



□ 






In the discrete context, we proved in Exercise 2.2.8 that if $\tau$ is a stopping time and $X$ an
adapted stochastic process, then $X_\tau$ is $\mathcal{F}_\tau$-measurable. However, this proof depends on the
random variable $\tau$ taking only countably many values. For example, if $\tau$ is integer-valued,
then using $X_\tau^{-1}(B) = \bigcup_n (X_n^{-1}(B) \cap \{\tau = n\})$, we see that $X_\tau^{-1}(B) \in \mathcal{F}_\tau$ for any Borel set
$B$. This argument will obviously not work in continuous-time. However, assuming that the
process $X_t$ is right-continuous, we can use the Discretization Lemma to prove that $X_\tau$ is
$\mathcal{F}_\tau$-measurable in the continuous-time case as well:

Choose stopping times $\tau_n \downarrow \tau$ as in the Discretization Lemma, so that each $\tau_n$ has a countable
range. It is then easy to prove that each $X_{\tau_n}$ is $\mathcal{F}_{\tau_n}$-measurable, as in the discrete-time
case. By Theorem 2.7.3(j), $\mathcal{F}_{\tau_n} \downarrow \mathcal{F}_\tau$. Since $\limsup_n X_{\tau_n} = \limsup_{n \ge N} X_{\tau_n}$, it follows that
$\limsup_n X_{\tau_n}$ is $\mathcal{F}_{\tau_N}$-measurable for each $N$, and thus that it is $\mathcal{F}_\tau$-measurable. But since
$X_t$ is right-continuous a.s., we see that $\limsup_n X_{\tau_n} = X_\tau$ a.s. Thus, using the fact that $\mathcal{F}_\tau$
contains all the $\mathbb{P}$-null sets, it follows that $X_\tau$ is $\mathcal{F}_\tau$-measurable.

We have thus shown:

Proposition 2.7.7 Let $(X_t)_{0 \le t < \infty}$ be an adapted process with a.s. right-continuous sample
paths. If $\tau$ is a stopping time, then $X_\tau$ is $\mathcal{F}_\tau$-measurable.



2.7.2 Martingales in Continuous Time 



In this section, we will generalize the main discrete-parameter martingale results from
Sections 2.1-2.6 to the continuous-parameter cadlag case. Throughout, all filtrations are assumed to
satisfy the usual conditions. Our first aim is to prove that $\mathcal{L}^1$-bounded sub/supermartingales
converge. The fact that $\mathbb{Q}^+$ is dense in $\mathbb{R}^+$ will be very important. Typically, $\mathbb{Q}^+$ will be
written as a countable union of an increasing family of finite sets. Restricted to each of
these finite sets, the stochastic processes look like discrete-parameter processes, and all the
discrete-parameter results will hold. Creative use of the monotonicity properties of measure
and the integration theorems will allow us to extend the results to stochastic processes indexed
by $\mathbb{Q}^+$. Finally, right-continuity will be used to extend the results to stochastic processes
indexed by $\mathbb{R}^+$. Since right-continuity involves approximation from above, the Lévy-Doob
results on reversed martingales will be important.

We begin by defining the notion of the number of upcrossings, which, as in the discrete- 
parameter case, will come in very handy. 






Definition 2.7.8 (a) Suppose that $X_t$, $t \ge 0$, is a real-valued adapted stochastic process,
and let $F$ be a finite subset of $\mathbb{R}^+$. For $a < b \in \mathbb{R}$, define $U_F(X; [a,b])$ to be the number
of upcrossings of $[a,b]$ in $F$. To be precise, define a double sequence of stopping times
$\tau_k, \sigma_k$ recursively by

\[ \tau_1(\omega) = \min\{t \in F : X_t(\omega) \le a\} \]
\[ \sigma_j(\omega) = \min\{t \in F : t \ge \tau_j(\omega),\ X_t(\omega) \ge b\} \]
\[ \tau_{j+1}(\omega) = \min\{t \in F : t \ge \sigma_j(\omega),\ X_t(\omega) \le a\} \]

We use here the convention that $\min(\emptyset) = +\infty$. Then $U_F(X; [a,b])(\omega)$ is defined to
be the largest integer $n$ for which $\sigma_n(\omega) < \infty$.

(b) If $I \subseteq \mathbb{R}^+$ is not necessarily finite, we define

\[ U_I(X; [a,b]) = \sup\{U_F(X; [a,b]) : F \text{ a finite subset of } I\} \]
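For a finite index set, the recursive definition of the upcrossing count translates directly into a single pass over the path. The sketch below is an illustration under assumptions: one sample path is represented as a dictionary from times to values, and the function name `upcrossings` is hypothetical.

```python
def upcrossings(times, path, a, b):
    """Number of upcrossings of [a, b] by t -> path[t] along the finite set `times`.

    Follows the recursive definition: tau_1 is the first time the path is <= a,
    sigma_j the first later time it is >= b, tau_{j+1} the first later time it is <= a.
    The result is the largest j for which sigma_j is finite.
    """
    count = 0
    waiting_for_low = True            # currently looking for tau_j (a value <= a)
    for t in sorted(times):
        x = path[t]
        if waiting_for_low and x <= a:
            waiting_for_low = False   # tau_j found; now look for sigma_j
        elif not waiting_for_low and x >= b:
            count += 1                # sigma_j found: one completed upcrossing
            waiting_for_low = True
    return count

# An assumed sample path observed at the times 0, 1, ..., 6.
path = {0: 0.0, 1: 2.0, 2: -1.0, 3: 3.0, 4: 0.5, 5: -0.2, 6: 1.5}
```

Restricting `times` to a subset can only lower the count, which is why part (b) of the definition takes a supremum over finite subsets.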



As in the discrete-parameter case, we have: 



Theorem 2.7.9 (Upcrossing Lemma)

Suppose that $X$ is a cadlag supermartingale, let $[S,T]$ be a subinterval of $\mathbb{R}^+$, and let
$a < b \in \mathbb{R}$. Then

\[ (b - a)\,\mathbb{E}U_{[S,T]}(X; [a,b]) \le \mathbb{E}[(X_T - a)^-] \]

Proof: We can prove this directly from the discrete-parameter version of the Upcrossing
Lemma. Let $F_n$ be an increasing sequence of finite subsets of $[S,T]$ with the following
properties:

(a) $S, T \in F_n$ for all $n$.

(b) $\bigcup_n F_n = ([S,T] \cap \mathbb{Q}) \cup \{S, T\}$

Now note that since $X$ is cadlag, we must have $U_{[S,T]} = U_{\bigcup_n F_n} = \uparrow\lim_n U_{F_n}$, and thus that
$\mathbb{E}U_{[S,T]} = \lim_n \mathbb{E}U_{F_n}$, by the Monotone Convergence Theorem. But $(X_t : t \in F_n)$ is a
discrete-parameter supermartingale (for each $n$), and $T \in F_n$. Thus by the discrete version of the
Upcrossing Lemma, we have $(b - a)\,\mathbb{E}U_{F_n} \le \mathbb{E}[(X_T - a)^-]$. Taking limits yields the result.



We now prove the respective martingale convergence theorems in one fell swoop. Recall
that the notion of uniform integrability was defined for arbitrary collections of random
variables, and that we do not need to redefine it for continuous-parameter processes. The same
goes for the notion of uniform $\mathbb{P}$-continuity.






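Theorem 2.7.10 (Doob's Martingale Convergence Theorem)

(a) Let $X$ be a cadlag supermartingale bounded in $\mathcal{L}^1$. Then there is a random variable
$X_\infty$ such that

\[ X_t \to X_\infty \quad \text{a.s.} \]

and $\mathbb{E}|X_\infty| < \infty$.

(b) Moreover, if $X$ is UI, then $X_t \to X_\infty$ in $\mathcal{L}^1$ and then $\mathbb{E}[X_\infty|\mathcal{F}_t] \le X_t$.

(c) Finally, if $X$ is a martingale, then $X_t \to X_\infty$ in $\mathcal{L}^1$ if and only if $X_t$ is UI, and then
equality holds in (b), i.e. $X_t = \mathbb{E}[X_\infty|\mathcal{F}_t]$.

Proof: (a) Define $C = \sup_t \mathbb{E}|X_t|$, so that $C < \infty$ (by assumption). Further define:

\[ \overline{X}(\omega) = \limsup_{t \to \infty} X_t(\omega) \qquad \underline{X}(\omega) = \liminf_{t \to \infty} X_t(\omega) \]

Now if $\lim_t X_t(\omega)$ does not exist, then we can find rational numbers $a, b$ such that $\underline{X}(\omega) < a <
b < \overline{X}(\omega)$, and thus $U_{\mathbb{R}^+}(X(\omega); [a,b]) = \infty$. However,

\[ \mathbb{E}U_{[0,n]}(X; [a,b]) \le \frac{\mathbb{E}|X_n| + |a|}{b - a} \le \frac{C + |a|}{b - a} < \infty \]

Letting $n \to \infty$, we see that

\[ \mathbb{E}U_{\mathbb{R}^+}(X; [a,b]) \le \frac{C + |a|}{b - a} < \infty \]

by the Monotone Convergence Theorem. It follows that the set $\{\omega : U_{\mathbb{R}^+}(X(\omega); [a,b]) = \infty\}$
has measure zero for each of the countably many rational pairs $a < b$, i.e. $X_\infty(\omega) = \lim_t X_t(\omega)$ exists a.s.
Moreover, by Fatou's Lemma, $\mathbb{E}|X_\infty| \le \liminf_t \mathbb{E}|X_t| \le C < \infty$.

To prove (b), note that if $X$ is UI and convergent in probability, then it is $\mathcal{L}^1$-convergent,
just as in the discrete case (see the footnote below). Since we also have $X_t \ge \mathbb{E}[X_n|\mathcal{F}_t]$ whenever $n \ge t$, and since
$X_n \ge \mathbb{E}[X_\infty|\mathcal{F}_n]$ by the discrete-parameter result, we see upon application of a conditional
expectation with respect to $\mathcal{F}_t$ that

\[ X_t \ge \mathbb{E}[X_\infty|\mathcal{F}_t] \]

as required.

Finally, we can prove (c) as follows: By (b) we know that if $X$ is UI, then $X_t \to X_\infty$ in
$\mathcal{L}^1$, because every martingale is a supermartingale, and the same argument will prove that
$X_t = \mathbb{E}[X_\infty|\mathcal{F}_t]$. Suppose now that $X_t \to X_\infty$ in $\mathcal{L}^1$. Then clearly also $\mathbb{E}[X_t|\mathcal{G}] \to \mathbb{E}[X_\infty|\mathcal{G}]$
in $\mathcal{L}^1$, for every $\mathcal{G} \subseteq \mathcal{F}$. It follows that $X_s = \mathbb{E}[X_t|\mathcal{F}_s] \to \mathbb{E}[X_\infty|\mathcal{F}_s]$ in $\mathcal{L}^1$ as $t \to \infty$, and thus
that each $X_s = \mathbb{E}[X_\infty|\mathcal{F}_s]$. But then $X_t$ is UI, by Theorem 2.4.8.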



Footnote (on the proof of (b)): As an exercise, check that the proof (c) $\Rightarrow$ (d) $\Rightarrow$ (a) in Theorem 2.4.5 also holds for continuous-parameter
processes. However, if you peruse the proof of (b) $\Rightarrow$ (c), you will note that a discrete-parameter process is
essential here. In fact, it is not true in general that an $\mathcal{L}^1$-convergent supermartingale $X$ is UI in
continuous-time, although this is the case if $X$ is a martingale.






It follows that a continuous-parameter martingale $M_t$ is UI precisely if it is of the form
$M_t = \mathbb{E}[M_\infty|\mathcal{F}_t]$ for some integrable random variable $M_\infty$.

Most of our results deal with right-continuous (super-, sub-)martingales. However, results
such as the Upcrossing Lemma do actually imply some fairly strong continuity conditions. We
shall show that if $X_t$ is a submartingale (w.r.t. some filtration satisfying the usual conditions),
then $X_t$ has a right-continuous version if and only if the map $t \mapsto \mathbb{E}X_t$ is right-continuous.
First, we need a lemma:



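Lemma 2.7.11 Let $X_t$ be an $\mathcal{F}_t$-submartingale. (We do not assume that $X_t$ is
right-continuous, nor that $\mathcal{F}_t$ satisfies the usual conditions.)

(i) There is an event $\Omega^* \in \mathcal{F}$ of measure 1 such that for every $\omega \in \Omega^*$ the limits

\[ X_{t+}(\omega) = \lim_{q \downarrow t,\, q \in \mathbb{Q}} X_q(\omega) \qquad X_{t-}(\omega) = \lim_{q \uparrow t,\, q \in \mathbb{Q}} X_q(\omega) \]

exist for all $t \ge 0$ (resp. $t > 0$).

(ii) Moreover,

\[ \mathbb{E}[X_{t+}|\mathcal{F}_t] \ge X_t \quad \text{a.s.} \]
\[ \mathbb{E}[X_t|\mathcal{F}_{t-}] \ge X_{t-} \quad \text{a.s.} \]

for all $t \ge 0$ (resp. $t > 0$).

(iii) $X_{t+}$ is an $\mathcal{F}_{t+}$-submartingale with almost every sample path cadlag.

(The set $\mathbb{Q}$ may be replaced by any countable dense subset of $\mathbb{R}$.)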



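Proof: (i) For $a < b \in \mathbb{Q}$ and $n \in \mathbb{N}$, define

\[ A^n_{a,b} = \{\omega \in \Omega : U_{[0,n] \cap \mathbb{Q}}(X(\omega); [a,b]) = \infty\} \]

Arguing as in the proof of the Upcrossing Lemma, we see that $\mathbb{P}(A^n_{a,b}) = 0$, for otherwise
$\mathbb{E}U_{[0,n] \cap \mathbb{Q}}(X; [a,b]) = \infty$. Let $\Omega^*$ be the complement of the countable union of the sets $A^n_{a,b}$.

Now if $t \ge 0$, choose $n \in \mathbb{N}$ such that $t < n$. If $\liminf_{q \downarrow t,\, q \in \mathbb{Q}} X_q(\omega) < \limsup_{q \downarrow t,\, q \in \mathbb{Q}} X_q(\omega)$,
then there are rationals $a < b$ strictly between these two values, and $\omega \in A^n_{a,b}$. Thus $\lim_{q \downarrow t,\, q \in \mathbb{Q}} X_q(\omega)$ exists for almost all $\omega$.
Similarly, $\lim_{q \uparrow t,\, q \in \mathbb{Q}} X_q(\omega)$ exists for almost all $\omega$.

(ii) Let $q_n \downarrow t$ (strictly) where $q_n \in \mathbb{Q}$. Put $Y_n = X_{q_n}$, $\mathcal{G}_n = \mathcal{F}_{q_n}$. Then $Y_n$ is a reversed
$\mathcal{G}_n$-submartingale, and $Y_n \to X_{t+}$ a.s. (by definition of $X_{t+}$ in (i)). Moreover, $\mathbb{E}Y_n \ge \mathbb{E}X_t > -\infty$
for each $n$. By the Lévy-Doob Downward Theorem, the family $Y_n$ is UI. Since $Y_n \to X_{t+}$ a.s.,
we now have $Y_n \to X_{t+}$ in $\mathcal{L}^1$ as well, by Theorem 2.4.5.

Now if $A \in \mathcal{F}_t$, then $\mathbb{E}[X_t; A] \le \mathbb{E}[Y_n; A]$, by the submartingale property, so that $\mathbb{E}[X_t; A] \le
\mathbb{E}[X_{t+}; A]$, since $Y_n \to X_{t+}$ in $\mathcal{L}^1$. Thus we have shown that $\mathbb{E}[X_{t+}|\mathcal{F}_t] \ge X_t$.

To prove the second inequality, choose $r_n \in \mathbb{Q}$ with $r_n \uparrow t$ (strictly). Then by the Strong
Dominated Convergence Theorem (Theorem 2.5.5), $\mathbb{E}[X_t|\mathcal{F}_{r_n}] \to \mathbb{E}[X_t|\mathcal{F}_{t-}]$ as $n \to \infty$, and by
the submartingale property, $X_{r_n} \le \mathbb{E}[X_t|\mathcal{F}_{r_n}]$. Letting $n \to \infty$, we get the second inequality.

(iii) It is easy to see that $X_{t+}$ is adapted to $\mathcal{F}_{t+}$. Moreover, if $s < t$ and $q_n \downarrow s$ (with
each $q_n < t$), then $\mathbb{E}[X_{t+}|\mathcal{F}_{q_n}] \ge \mathbb{E}[X_t|\mathcal{F}_{q_n}] \ge X_{q_n}$ a.s., by the first inequality of (ii) and the
submartingale property. By the Strong Dominated Convergence Theorem, we therefore have

\[ \mathbb{E}[X_{t+}|\mathcal{F}_{s+}] = \lim_n \mathbb{E}[X_{t+}|\mathcal{F}_{q_n}] \ge \lim_n X_{q_n} = X_{s+} \quad \text{a.s.} \]

Thus $X_{t+}$ is indeed an $\mathcal{F}_{t+}$-submartingale. To see that almost every sample path of $X_{t+}$ is
cadlag is now easy, using (i).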



Having proved the lemma, we can now give conditions under which a submartingale has 
a cadlag modification: 

Theorem 2.7.12 Let $X_t$ be a submartingale w.r.t. a filtration $\mathcal{F}_t$ satisfying the usual
conditions. Then $X_t$ has a cadlag modification (which is also a submartingale adapted to
$\mathcal{F}_t$) if and only if the function $t \mapsto \mathbb{E}X_t$ is right-continuous.

Proof: First assume that the map $t \mapsto \mathbb{E}X_t$ is indeed right-continuous. Since $\mathcal{F}_{t+} = \mathcal{F}_t$
(by the usual conditions), the process $X_{t+}$ defined in Lemma 2.7.11 is an $\mathcal{F}_t$-submartingale
with a.s. cadlag sample paths. It therefore remains to show that $X_{t+}$ is a version of $X_t$,
i.e. that $\mathbb{P}(X_t = X_{t+}) = 1$ for all $t \ge 0$. So let $t \ge 0$ and choose rational numbers $q_n \downarrow t$.
By the Lévy-Doob Downward Theorem, the reversed submartingale $(X_{q_n}, \mathcal{F}_{q_n})$ is UI. Since
$X_{q_n} \to X_{t+}$ a.s., we therefore must have $\lim_n \mathbb{E}X_{q_n} = \mathbb{E}X_{t+}$ by Theorem 2.4.5. But by
hypothesis, $\mathbb{E}X_t = \lim_n \mathbb{E}X_{q_n}$, and so $\mathbb{E}X_{t+} = \mathbb{E}X_t$. However, by Lemma 2.7.11, $X_{t+} \ge X_t$
a.s. (since $\mathcal{F}_{t+} = \mathcal{F}_t$). Thus $X_{t+} = X_t$ a.s.

For the converse, suppose that $Y_t$ is a right-continuous modification of $X_t$. Pick $t \ge 0$,
and let $t_n \downarrow t$. We must show that $\mathbb{E}X_t = \lim_n \mathbb{E}X_{t_n}$. Now because $Y_t$ is a modification of $X_t$,
we have

\[ \mathbb{P}\big(X_t = Y_t,\ \forall n \ge 0\ (X_{t_n} = Y_{t_n})\big) = 1 \]

(since this involves a countable intersection of sets of measure 1). Also $Y_t = \lim_n Y_{t_n}$ a.s.,
by right-continuity of $Y_t$. It follows that $X_t = \lim_n X_{t_n}$ a.s. But, again by the Lévy-Doob
Downward Theorem, the family $X_{t_n}$ is UI. Hence $\mathbb{E}X_t = \lim_n \mathbb{E}X_{t_n}$ by Theorem 2.4.5.



Immediately, we have: 



Corollary 2.7.13 Let Xt be a martingale with respect to a filtration that satisfies the 
usual conditions. Then Xt has a cadlag modification. 

Next, we tackle continuous-parameter versions of the martingale inequalities. 



Theorem 2.7.14 (Doob's Maximal Inequality)

Let $X_t$ be a cadlag submartingale, let $[S,T]$ be a subinterval of $\mathbb{R}^+$ and let $c > 0$ be a real
number. Then

\[ c\,\mathbb{P}\Big(\sup_{S \le t \le T} X_t \ge c\Big) \le \mathbb{E}[X_T^+] \]

S<t<T 



Proof: Note that if $X$ is a submartingale, then $X^+$ is a non-negative submartingale. This
follows from the fact that $\varphi(x) = x^+ = \max\{x, 0\}$ is convex and non-decreasing, and Jensen's inequality. Let $F'$
be a finite subset of $[S,T] \cap \mathbb{Q}$, and let $F = F' \cup \{S, T\}$. Also let $0 < c' < c$. By the discrete
version of Doob's maximal inequality, we have

\[ c'\,\mathbb{P}\Big(\sup_{t \in F} X_t > c'\Big) \le c'\,\mathbb{P}\Big(\sup_{t \in F} X_t^+ > c'\Big) \le \mathbb{E}X_T^+ \]

This is true for all finite $F' \subseteq [S,T] \cap \mathbb{Q}$. Now choose an increasing sequence $F'_n$ of finite
subsets of $\mathbb{Q}$ with the property that $\bigcup_n F'_n = [S,T] \cap \mathbb{Q}$, and let $F_n = F'_n \cup \{S, T\}$ for each $n$.
Let $G = \bigcup_n F_n$. We now see that

\[ c'\,\mathbb{P}\Big(\sup_{t \in G} X_t > c'\Big) \le \mathbb{E}[X_T^+] \]

as follows: Note that

\[ \Big\{\sup_{t \in G} X_t > c'\Big\} = \bigcup_n A_n \]

where

\[ A_n = \Big\{\sup_{t \in F_n} X_t > c'\Big\} \]

Note that $A_n$ is increasing, and thus it follows by monotonicity properties of measure that
$\mathbb{P}(\{\sup_{t \in G} X_t > c'\}) = \lim_n \mathbb{P}(A_n)$. But $c'\,\mathbb{P}(A_n) \le \mathbb{E}X_T^+$, and thus $c'\,\mathbb{P}(\{\sup_{t \in G} X_t > c'\}) \le \mathbb{E}X_T^+$.
By right-continuity, we have $c'\,\mathbb{P}(\{\sup_{S \le t \le T} X_t > c'\}) \le \mathbb{E}X_T^+$. Finally, letting $c' \uparrow c$ yields the
result.

Theorem 2.7.15 (Doob's $\mathcal{L}^p$-inequality) Suppose that $1 < p < \infty$ and that $q$ is defined
so that $\frac{1}{p} + \frac{1}{q} = 1$. Let $X$ be a non-negative submartingale bounded in $\mathcal{L}^p$, and define

\[ X^* = \sup_{t \ge 0} X_t \]

Then $X^* \in \mathcal{L}^p$, and

\[ \|X^*\|_p \le q \sup_{t \ge 0} \|X_t\|_p \]

The submartingale $X$ is therefore dominated by $X^* \in \mathcal{L}^p$.

Proof: The proof of the discrete-parameter version depends only on Doob's maximal inequality,
whose continuous-parameter version we have just established. Thus the proof of the
discrete-parameter theorem will also work for the continuous-parameter version.

□

Doob's $\mathcal{L}^p$-inequality is very useful for proving convergence results. For example, suppose
that $X_t$ is a martingale bounded in $\mathcal{L}^p$, where $p > 1$. By Jensen's inequality, $|X_t|$ is a
non-negative submartingale, and by the $\mathcal{L}^p$-inequality, $|X_t|^p$ is dominated by the integrable
random variable $(X^*)^p$, where $X^* = \sup_{t \ge 0} |X_t|$. Thus if $X_t \to X_\infty$ a.s., then $|X_t - X_\infty|^p \to 0$ a.s., and $|X_t - X_\infty|^p$
is dominated by $(2X^*)^p \in \mathcal{L}^1$. By the Dominated Convergence Theorem, we thus have
$X_t \to X_\infty$ in $\mathcal{L}^p$ as well.

The following corollary is just a reformulation of Doob's $\mathcal{L}^p$-inequality:






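Corollary 2.7.16 Suppose that $1 < p < \infty$ and that $M$ is a martingale bounded in $\mathcal{L}^p$.
Then

\[ \mathbb{E}\Big[\sup_{0 \le t \le T} |M_t|^p\Big] \le \Big(\frac{p}{p-1}\Big)^p\, \mathbb{E}[|M_T|^p] \]

In particular, $\mathbb{E}\big[\sup_{0 \le t \le T} M_t^2\big] \le 4\,\mathbb{E}[M_T^2]$.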
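The case $p = 2$ of the corollary can be illustrated by simulation. The sketch below rests on assumptions that are not part of the text: the martingale is a standard Brownian motion on $[0, T]$, approximated by a Gaussian random walk on a grid, and expectations are replaced by Monte Carlo averages with a fixed seed. The function name `sup_and_terminal_second_moments` and all parameter defaults are hypothetical choices.

```python
import random

def sup_and_terminal_second_moments(n_paths=2000, n_steps=200, T=1.0, seed=0):
    """Monte Carlo estimates of E[max_k B_{t_k}^2] and E[B_T^2] for a gridded Brownian path."""
    rng = random.Random(seed)             # fixed seed so the run is reproducible
    step_sd = (T / n_steps) ** 0.5        # standard deviation of one Gaussian increment
    sum_sup_sq = 0.0
    sum_end_sq = 0.0
    for _ in range(n_paths):
        b = 0.0
        sup_abs = 0.0
        for _ in range(n_steps):
            b += rng.gauss(0.0, step_sd)
            sup_abs = max(sup_abs, abs(b))
        sum_sup_sq += sup_abs ** 2
        sum_end_sq += b * b
    return sum_sup_sq / n_paths, sum_end_sq / n_paths

sup_sq, end_sq = sup_and_terminal_second_moments()
```

The estimate `sup_sq` should lie between `end_sq` and `4 * end_sq`. The lower bound holds path by path, while the upper bound is the inequality of the corollary, here observed only up to sampling and discretization error.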



2.7.3 Optional Sampling 

We now turn our attention to a continuous-parameter version of the optional sampling
theorem. In order to be able to use the discrete-parameter version, we need to be able to
approximate a cadlag martingale by a discrete version. As in the proof of the Discretization
Lemma, we define, for each $n \in \mathbb{N}$,

\[ D_n^+ = \{k 2^{-n} : k \in \mathbb{N}\} \]

to be the set of non-negative dyadic rationals of order $\le n$. Note that $m \le n$ implies
$D_m^+ \subseteq D_n^+$, and that $\bigcup_n D_n^+$ is a countable dense subset of $\mathbb{R}^+$. Note also that if $X$ is a
martingale, then $(X_q : q \in D_n^+)$ is a discrete-parameter martingale, for each $n$.



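Lemma 2.7.17 Let $X$ be a cadlag supermartingale, and let $\tau$ be a stopping time. Let
$t \ge 0$, and define

\[ \tau_n(\omega) = \inf\{q \in D_n^+ : q > \tau(\omega)\} \qquad t_n = \inf\{q \in D_n^+ : q > t\} \]

Then $\tau_n$ is a stopping time w.r.t. the filtration $(\mathcal{F}_q : q \in D_n^+)$. We also have $\tau_n \downarrow \tau$ and
$\mathcal{F}_{\tau_n} \downarrow \mathcal{F}_\tau$. Moreover

\[ X_{\tau_n \wedge t_n} \to X_{\tau \wedge t} \quad \text{a.s. and in } \mathcal{L}^1 \]

and thus $X_{\tau \wedge t} \in \mathcal{L}^1$.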


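Proof: The $\tau_n$ defined here is exactly the same as the $\tau_n$ defined in the Discretization Lemma.
Thus each $\tau_n$ is a stopping time and $\tau_n \downarrow \tau$. It follows from Theorem 2.7.3 that $\mathcal{F}_{\tau_n} \downarrow \mathcal{F}_\tau$ as
well. Note also that $t_n \downarrow t$, and thus that $\tau_n \wedge t_n \downarrow \tau \wedge t$.

Using the discrete version of the Optional Sampling Theorem (applied to the closed
discrete-parameter supermartingale $(X_{k/2^{n+1}} : k/2^{n+1} \le t + 1)$), we see that

\[ \mathbb{E}[X_{\tau_n \wedge t_n}|\mathcal{F}_{\tau_{n+1} \wedge t_{n+1}}] \le X_{\tau_{n+1} \wedge t_{n+1}} \]

and similarly that $\mathbb{E}X_{\tau_n \wedge t_n} \le \mathbb{E}X_0$. So define $Y_n = X_{\tau_n \wedge t_n}$, $\mathcal{G}_n = \mathcal{F}_{\tau_n \wedge t_n}$. Then we have

\[ \mathbb{E}[Y_n|\mathcal{G}_{n+1}] \le Y_{n+1} \]

and $Y_n$ is a reversed $\mathcal{G}_n$-supermartingale. Moreover, $\lim_n \mathbb{E}Y_n \le \mathbb{E}X_0 < \infty$. By the
Lévy-Doob Downward Theorem, $Y$ is UI and the limit

\[ X_{\tau \wedge t} = Y_\infty = \lim_{n \to \infty} Y_n \]

exists a.s. and in $\mathcal{L}^1$ (where we used the fact that $X$ is cadlag).

□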

Suppose that $X$ is a continuous-parameter stochastic process, and that $\tau$ is a stopping
time. We define the stopped process $X^\tau$ by $X^\tau_t = X_{\tau \wedge t}$. The following result cannot be
unexpected:

Theorem 2.7.18 (Stopped (super)martingales are (super)martingales)

Suppose that $X_t$ is a cadlag (super)martingale (w.r.t. a filtration that satisfies the usual
conditions). If $\tau$ is a stopping time, then $X^\tau$ is also a cadlag (super)martingale w.r.t. the
same filtration.



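Proof: Suppose that $X_t$ is a cadlag supermartingale, and let $0 \le s < t$. Define $\tau_n, t_n$ as
in the above lemma, and define $s_n$ analogously (i.e. $s_n = \inf\{q \in D_n^+ : q > s\}$). By the
discrete-parameter "Stopped (super)martingales are (super)martingales" theorem, it follows
that

\[ \mathbb{E}[X_{\tau_n \wedge t_n}|\mathcal{F}_{s_m}] \le X_{\tau_n \wedge s_m} \quad \text{for } m \ge n \]

(using the parameter set $D_m^+$). Now let $m \to \infty$. Note that $\lim_n \mathbb{E}X_{\tau_n \wedge t_n} \le \mathbb{E}X_0 < \infty$, so by
the Lévy-Doob Downward Theorem and right-continuity,

\[ \mathbb{E}[X_{\tau_n \wedge t_n}|\mathcal{F}_s] \le X_{\tau_n \wedge s} \]

Now let $n \to \infty$. Then

\[ \mathbb{E}[X_{\tau_n \wedge t_n}|\mathcal{F}_s] \to \mathbb{E}[X_{\tau \wedge t}|\mathcal{F}_s] \]

by the preceding lemma, and

\[ X_{\tau_n \wedge s} \to X_{\tau \wedge s} \]

by right-continuity. Putting the results together yields

\[ \mathbb{E}[X_{\tau \wedge t}|\mathcal{F}_s] \le X_{\tau \wedge s} \quad \text{a.s.} \]

(with equality if $X$ is a martingale), as required.

□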
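A discrete-time analogue makes the statement of Theorem 2.7.18 concrete. The sketch below is an illustration under assumptions that are not in the text: $M$ is the simple symmetric random walk and $\tau$ is the first time $M$ reaches a fixed positive level. It computes $\mathbb{E}[M_{\tau \wedge k}]$ exactly by enumerating all paths; for a stopped martingale these means must all equal $\mathbb{E}M_0 = 0$. The function name `stopped_walk_means` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def stopped_walk_means(n, level):
    """Return [E[M_{tau ^ k}] for k = 0..n], where tau = first time the walk reaches `level`."""
    paths = list(product([-1, 1], repeat=n))
    weight = Fraction(1, len(paths))      # each path is equally likely
    means = [Fraction(0)] * (n + 1)       # means[0] = E[M_0] = 0
    for steps in paths:
        m, stopped = 0, False
        for k, s in enumerate(steps, start=1):
            if not stopped:
                m += s
                stopped = m >= level      # freeze the walk once tau has occurred
            means[k] += weight * m        # m is the value of M at time tau ^ k
    return means
```

The stopping rule is deliberately one-sided, so the vanishing of the means is not a consequence of symmetry alone but of the martingale property of the stopped walk.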

In the discrete-parameter theory, the preceding result was proved using a martingale 
transform. We have not yet defined continuous-parameter analogues of martingale transforms 
and previsible processes, but that is because these are difficult to generalize, and will need 
quite a bit more theory: The generalization of the martingale transform to continuous-time 
is the stochastic integral. 

We can now prove: 

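Theorem 2.7.19 (Doob's Optional Sampling Theorem for closed cadlag supermartingales)

Suppose that $X$ is a closed cadlag supermartingale, and let $\sigma, \tau$ be stopping times. Then
$X_\tau, X_\sigma \in \mathcal{L}^1$, and

\[ \mathbb{E}[X_\infty|\mathcal{F}_\tau] \le X_\tau \quad \text{a.s.} \qquad \mathbb{E}[X_\tau|\mathcal{F}_\sigma] \le X_{\tau \wedge \sigma} \quad \text{a.s.} \]

with equality if $X$ is a closed martingale.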






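Proof: First assume that $\sigma \le \tau$ a.s. Let $X_\infty$ be a last element of $X$, i.e. $X_\infty$ is
$\mathcal{F}_\infty$-measurable and integrable, with $\mathbb{E}[X_\infty|\mathcal{F}_t] \le X_t$ for all $t$ (and equality if $X$ is a martingale).
Define $\tau_n, \sigma_n$ as in Lemma 2.2.10. Note that $\sigma_n \le \tau_n$. Note that $\tau_n \downarrow \tau$ and thus that
$X_{\tau_n} \to X_\tau$ a.s. (by right-continuity). Furthermore, $\mathcal{F}_{\tau_n} \downarrow \mathcal{F}_\tau$, by Theorem 2.7.3, and
$\mathbb{E}X_{\tau_n} = \mathbb{E}X_{\tau_n \wedge \infty} \le \mathbb{E}X_0 < \infty$ (by the discrete Optional Sampling Theorem applied to the
stopped supermartingale $X^{\tau_n}$). Thus by the Lévy-Doob Downward Theorem, the family $X_{\tau_n}$
is UI, and thus $X_{\tau_n} \to X_\tau$ in $\mathcal{L}^1$ as well, and $\mathbb{E}[X_{\tau_n}|\mathcal{F}_\tau] \le X_\tau$ a.s. (with equality if $X$ is a
martingale). It follows that $X_\tau$ is integrable. Similarly, the family $X_{\sigma_n}$ is UI, and $X_{\sigma_n} \to X_\sigma$
a.s. and in $\mathcal{L}^1$, $\mathbb{E}[X_{\sigma_n}|\mathcal{F}_\sigma] \le X_\sigma$ a.s., and $X_\sigma$ is integrable.

By the discrete Optional Sampling Theorem (on the parameter set $D_n^+$), we have

\[ \mathbb{E}[X_{\tau_n}|\mathcal{F}_{\sigma_n}] \le X_{\sigma_n} \quad \text{a.s.} \]

and by Proposition 2.5.4(b), $\mathbb{E}[X_{\tau_n}|\mathcal{F}_{\sigma_n}] \to \mathbb{E}[X_\tau|\mathcal{F}_\sigma]$ in $\mathcal{L}^1$. Hence there is a subsequence
$n_k$ such that $\mathbb{E}[X_{\tau_{n_k}}|\mathcal{F}_{\sigma_{n_k}}] \to \mathbb{E}[X_\tau|\mathcal{F}_\sigma]$ a.s. Since also $\mathbb{E}[X_{\tau_{n_k}}|\mathcal{F}_{\sigma_{n_k}}] \le X_{\sigma_{n_k}}$ a.s., we obtain
$\mathbb{E}[X_\tau|\mathcal{F}_\sigma] \le X_\sigma$ a.s. upon letting $k \to \infty$ on both sides. We have now proved the Optional
Sampling Theorem in case $\sigma \le \tau$ a.s.

Now let $\sigma, \tau$ be arbitrary. We want to show that $\mathbb{E}[X_\tau|\mathcal{F}_\sigma] \le X_{\tau \wedge \sigma}$. Since $\sigma \wedge \tau \le \tau$,
we can apply the version of the Optional Sampling Theorem just proved to deduce that
$\mathbb{E}[X_\tau|\mathcal{F}_{\sigma \wedge \tau}] \le X_{\sigma \wedge \tau}$. Now if $A \in \mathcal{F}_\sigma$, then $A \cap \{\sigma \le \tau\} \in \mathcal{F}_{\sigma \wedge \tau}$, by Theorem 2.7.3(g). Thus

\[ \mathbb{E}[X_\tau; A] = \mathbb{E}[X_\tau; A \cap \{\sigma \le \tau\}] + \mathbb{E}[X_\tau; A \cap \{\sigma > \tau\}] \]

But $\mathbb{E}[X_\tau; A \cap \{\sigma \le \tau\}] \le \mathbb{E}[X_{\tau \wedge \sigma}; A \cap \{\sigma \le \tau\}]$ (with equality if $X$ is a UI martingale) because
$\mathbb{E}[X_\tau|\mathcal{F}_{\sigma \wedge \tau}] \le X_{\sigma \wedge \tau}$ and $A \cap \{\sigma \le \tau\} \in \mathcal{F}_{\sigma \wedge \tau}$. Also, $X_\tau = X_{\tau \wedge \sigma}$ on the set $A \cap \{\sigma > \tau\}$. The
result now follows.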



Remarks 2.7.20 The following facts may be useful when applying the Optional Sampling
Theorem:

1. A UI supermartingale is closed (but not necessarily vice versa in the continuous-parameter 
case). 

2. A martingale is closed iff it is UI. 



□ 



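Exercise 2.7.21 (a) If $M$ is a martingale and $K$ a finite non-negative real, then the stopped
martingale $M^K$ is UI.

(b) Let $M$ be a martingale. Suppose that $\sigma, \tau$ are stopping times, and that the stopped
martingale $M^\tau$ is UI. Then the stopped martingale $M^{\sigma \wedge \tau}$ is UI.

[Hint: (b) Note first that $M^\tau_t \to M_\tau$ a.s. and in $\mathcal{L}^1$ as $t \to \infty$ (why?) and thus that $M_\tau \in \mathcal{L}^1$.
Next, note that $M^{\sigma \wedge \tau}_t = \mathbb{E}[M^\tau_\infty \mid \mathcal{F}_{\sigma \wedge t}] = \mathbb{E}[M_\tau \mid \mathcal{F}_{\sigma \wedge t}]$. Theorem [2X8]]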

□ 



Next, we characterize all UI càdlàg martingales (cf. Theorem 2.4.14):



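Theorem 2.7.22 Suppose that $M$ is an adapted càdlàg process with $M_0 = 0$ such that
for every stopping time $\tau \le \infty$ we have

$$\mathbb{E}|M_\tau| < \infty \qquad \mathbb{E}M_\tau = 0$$

Then $M$ is a UI martingale.

Proof: With $\tau = \infty$, we see that $M_\infty \in \mathcal{L}^1$ (where $M_\infty(\omega)$ is defined to be $\lim_{t \to \infty} M_t(\omega)$ if this
limit exists, and $0$ otherwise). To show that $M$ is a UI martingale, it suffices to check that

$$M_t = \mathbb{E}[M_\infty \mid \mathcal{F}_t]$$

So let $F \in \mathcal{F}_t$. Define

$$\tau(\omega) = \begin{cases} t & \text{if } \omega \in F \\ \infty & \text{else} \end{cases}$$

Then $\tau$ is easily seen to be a stopping time. Now

$$\mathbb{E}[M_\infty] = \mathbb{E}[M_\infty; F] + \mathbb{E}[M_\infty; \Omega - F] = 0$$
$$\mathbb{E}[M_\tau] = \mathbb{E}[M_t; F] + \mathbb{E}[M_\infty; \Omega - F] = 0$$

It follows that $\mathbb{E}[M_t; F] = \mathbb{E}[M_\infty; F]$. Since this is true for all $F \in \mathcal{F}_t$, we must have $M_t =
\mathbb{E}[M_\infty \mid \mathcal{F}_t]$, as required.

⊣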

2.7.4 Spaces of Martingales 

A martingale $M$ is called an $\mathcal{L}^2$-martingale if and only if $\sup_t \|M_t\|_2 < \infty$, i.e. if and only if
$M$ is bounded in $\mathcal{L}^2$. An $\mathcal{L}^2$-martingale is often called square-integrable.

The spaces of square-integrable martingales will be important for the definition of the 
stochastic integral. When we define the stochastic integral, this will be done in several stages. 
At one stage, we will consider it to be the limit of a sequence of square-integrable martingales. 
This limit must also be a square-integrable martingale, so the space of martingales must be 
complete. 

Recall that if $M$ is a martingale which is bounded in $\mathcal{L}^2$, then $M$ is necessarily bounded in $\mathcal{L}^1$, by
Hölder's inequality, and is also UI, by Proposition 2.4.6. Thus if $M$ is càdlàg, the Martingale
Convergence Theorem guarantees the existence of a random variable $M_\infty$ such that $M_t \to M_\infty$
a.s. and in $\mathcal{L}^1$. But does it also converge in $\mathcal{L}^2$? The following theorem proves that this is so:

Theorem 2.7.23 Suppose that $M$ is a càdlàg martingale bounded in $\mathcal{L}^p$, where $p > 1$.
Then there is a random variable $M_\infty \in \mathcal{L}^p$ such that

$$M_t \to M_\infty \quad \text{a.s. and in } \mathcal{L}^p$$

and

$$\sup_t \mathbb{E}[|M_t|^p] = \lim_{t \to \infty} \mathbb{E}[|M_t|^p] = \mathbb{E}[|M_\infty|^p]$$






Proof: We repeat the argument presented just after the proof of the $\mathcal{L}^p$-inequality (Theorem
2.7.15): Because $M$ is UI (Proposition 2.4.6), it is clear that there is a random variable
$M_\infty$ such that $M_t \to M_\infty$ a.s. and in $\mathcal{L}^1$ (by the Martingale Convergence Theorem). Now
clearly $|M_t|$ is a non-negative submartingale (by Jensen's inequality). Define $M^* = \sup_t |M_t|$.
Using Doob's $\mathcal{L}^p$-inequality, we see that $M^* \in \mathcal{L}^p$, and that $|M_t| \le M^*$ for each $t$. Since
$M_t \to M_\infty$ a.s., we also have $|M_\infty| \le M^*$. Thus $|M_t - M_\infty| \le |M_t| + |M_\infty| \le 2M^*$, so
that $|M_t - M_\infty|^p \le (2M^*)^p \in \mathcal{L}^1$. By the Dominated Convergence Theorem we thus have
$|M_t - M_\infty|^p \to 0$ in $\mathcal{L}^1$, i.e. $M_t \to M_\infty$ in $\mathcal{L}^p$.

Moreover, since $|M_t|^p$ is a submartingale, we see that if $s \le t$, then

$$\mathbb{E}[|M_s|^p] \le \mathbb{E}[|M_t|^p]$$

so that $\mathbb{E}[|M_t|^p]$ is increasing (in $t$). Thus $\sup_t \mathbb{E}[|M_t|^p] = \lim_{t \to \infty} \mathbb{E}[|M_t|^p]$. But $|M_t|^p \to |M_\infty|^p$
a.s., and since $|M_t|^p \le |M^*|^p \in \mathcal{L}^1$, we also have $\mathbb{E}[|M_t|^p] \to \mathbb{E}[|M_\infty|^p]$ (as $t \to \infty$) by the
Dominated Convergence Theorem.

⊣

Let $(\Omega, \mathcal{F}, \mathbb{P}, (\mathcal{F}_t)_t)$ be a filtered probability space, and suppose that $M$ is a square-integrable
martingale. Then $M_t \to M_\infty$ a.s. and in $\mathcal{L}^2$, and $M_t = \mathbb{E}[M_\infty \mid \mathcal{F}_t]$. Similarly,
given any random variable $X \in \mathcal{L}^2$, we can define a square-integrable martingale $M$ by
$M_t = \mathbb{E}[X \mid \mathcal{F}_t]$. Then $M_t \to M_\infty = \mathbb{E}[X \mid \mathcal{F}_\infty]$ a.s. Moreover, $\mathbb{E}M_t^2 \le \mathbb{E}M_\infty^2 \le \mathbb{E}X^2 < \infty$, so
that $M_t$ is a square-integrable martingale with $M_t \to M_\infty$ in $\mathcal{L}^2$ as well.

There is therefore a correspondence between square-integrable martingales and square-integrable
$\mathcal{F}_\infty$-measurable random variables, which means that the space of square-integrable
martingales can be equipped with the same structure as the Hilbert space of square-integrable
random variables. This motivates the following definitions:

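Definition 2.7.24 (a) $\mathcal{M}^2 = \{\text{martingales } M : M_\infty \in \mathcal{L}^2\}$

(b) $\mathcal{M}^2_0 = \{M \in \mathcal{M}^2 : M_0 = 0 \text{ a.s.}\}$

(c) $c\mathcal{M}^2 = \{M \in \mathcal{M}^2 : M \text{ is continuous a.s.}\}$

(d) $c\mathcal{M}^2_0 = \mathcal{M}^2_0 \cap c\mathcal{M}^2$

We define an inner product on $\mathcal{M}^2$ by

$$\langle M, N \rangle = \mathbb{E}[M_\infty N_\infty]$$

The norm $\|\cdot\|_2$ on $\mathcal{M}^2$ induced by this inner product is

$$\|M\|_2 = (\mathbb{E}M_\infty^2)^{1/2} = \|M_\infty\|_2$$

Here the norm on the left is the norm of the martingale, whereas the norm on the right is
the usual $\mathcal{L}^2$-norm of the random variable $M_\infty$. We have used the same notation for these
norms. Note also that, by Theorem 2.7.23,

$$\|M\|_2 = \sup_t \|M_t\|_2$$

(where again the norm on the left is the $\mathcal{M}^2$-norm and the norm on the right the $\mathcal{L}^2$-norm).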






Theorem 2.7.25 The spaces $\mathcal{M}^2, \mathcal{M}^2_0, c\mathcal{M}^2, c\mathcal{M}^2_0$ are Hilbert spaces.

Proof: It ought to be clear from the foregoing discussion that $\mathcal{M}^2$ is isomorphic to the Hilbert
space $\mathcal{L}^2(\Omega, \mathcal{F}_\infty, \mathbb{P})$, and thus itself a Hilbert space. It is trivial to show that all the other
spaces inherit the Hilbert space structure from $\mathcal{L}^2$ as well, except possibly for completeness.

We check that $c\mathcal{M}^2_0$ is complete; the proofs for the other spaces are easy adaptations. So
let $M^{(n)}$ be a Cauchy sequence of martingales in $c\mathcal{M}^2_0$. Each $M^{(n)}$ converges to some $M^{(n)}_\infty$
a.s. and in $\mathcal{L}^2$. Now because of the way the norm is defined in $\mathcal{M}^2$, it follows that $M^{(n)}_\infty$ is a
Cauchy sequence in the complete space $\mathcal{L}^2$ and therefore converges. Thus define

$$M_\infty = \lim_{n \to \infty} M^{(n)}_\infty \quad (\text{limit in } \mathcal{L}^2)$$
$$M_t = \mathbb{E}[M_\infty \mid \mathcal{F}_t]$$

It is clear that $M$ so defined is a square-integrable martingale, but we still need to check that
$M$ is a.s. continuous and that $M_0 = 0$. Now for each $n$, $M^{(n)} - M$ is a square-integrable
martingale, and so by Theorem 2.7.23 we have

$$\sup_{t \ge 0} \mathbb{E}[(M^{(n)}_t - M_t)^2] = \mathbb{E}[(M^{(n)}_\infty - M_\infty)^2] = \|M^{(n)} - M\|_2^2 \to 0 \quad \text{as } n \to \infty$$

In particular, $\mathbb{E}M_0^2 = \mathbb{E}(M^{(n)}_0 - M_0)^2 \to 0$, so that $\mathbb{E}M_0^2 = 0$. Thus $M_0 = 0$ a.s.

By Doob's $\mathcal{L}^2$-inequality, we see that $\| \sup_t |M^{(n)}_t - M_t| \, \|_2 \le 2 \sup_t \|M^{(n)}_t - M_t\|_2 =
2\|M^{(n)}_\infty - M_\infty\|_2$. Thus $\sup_t |M^{(n)}_t - M_t| \to 0$ in $\mathcal{L}^2$, and thus in probability. It follows
that there is a subsequence $n_k$ such that $\sup_t |M^{(n_k)}_t - M_t| \to 0$ a.s. (as $k \to \infty$), and thus
that $M^{(n_k)} \to M$ uniformly almost surely on the interval $[0, \infty]$. Since the uniform limit of
continuous functions is continuous, $M$ is also continuous a.s.



2.7.5 Local Martingales 

Definition 2.7.26 Let $P$ be a property. We say that a stochastic process $X_t$ is locally
$P$ if and only if there is a sequence of stopping times $\tau_n \uparrow \infty$ s.t. each stopped process
$X^{\tau_n}$ is $P$. However, additional requirements may also have to be met. The sequence $\tau_n$ is
called a reducing or localizing sequence for $X_t$.

Thus we can speak of local martingales, local submartingales, etc. Here, however, we will
want to impose some extra requirements:

Definition 2.7.27 A stochastic process $X$ is called a local martingale (w.r.t. $(\mathcal{F}_t)_{t \ge 0}$)
if and only if there are stopping times $\tau_n \uparrow \infty$ such that each shifted stopped process
$X^{\tau_n} - X_0$ is a martingale w.r.t. the filtration $(\mathcal{F}_{\tau_n \wedge t})_{t \ge 0}$. The sequence of stopping times
$\tau_n$ is called a localizing sequence, and is said to reduce $X$.

The reason for specifying that $X^{\tau_n} - X_0$ be a martingale, rather than just $X^{\tau_n}$, is that
one may encounter circumstances where $X_0$ may not be integrable. For example, the initial
distribution of the process may be badly behaved (e.g. non-integrable), while the process is
well-behaved from there onwards. In many applications $X_0$ will be 0.

Note that any martingale is a local martingale: Simply let $\tau_n = n$.






Remarks 2.7.28 There are several reasons why it is often easier to work with local martingales
than with martingales:

(1) In the first place, it frequently stops us from having to fuss about integrability. For
example, if $X$ is a martingale, and $\varphi$ is a convex function, then $\varphi(X)$ is always a local
submartingale. It will be a submartingale only if each $\mathbb{E}|\varphi(X_t)| < \infty$, which may be
difficult to check.

(2) Often we deal with a process $X$ that is only defined on a random time interval $[0, \tau)$.
If $\tau < \infty$, then the concept of martingale does not make sense, because $X_t(\omega)$ will be
defined only if $t < \tau(\omega)$. It is easy to define a local martingale, however: There are
stopping times $\tau_n \uparrow \tau$ such that each $X^{\tau_n}$ is a martingale.

(3) The extra generality obtained by working with local martingales is not offset by an
increase in complexity of the proofs: Since most of the theorems will be proved by introducing
stopping times to reduce the problem to a question about "nice" martingales, the
proofs for local martingales are no harder than those for ordinary martingales.



□ 



Note that not every local martingale X is a martingale, not even if each Xt is integrable. 



Exercise 2.7.29 The definition of local (sub)martingale differs in some texts. In particular,
$X_t$ is often defined to be a local martingale if and only if there is a reducing sequence $\tau_n$ such
that each $X_{\tau_n \wedge t} - X_0$ is a martingale with respect to the filtration $(\mathcal{F}_t)_{t \ge 0}$. The aim of this
exercise is to show that this doesn't affect the definition: $X^\tau - X_0$ is an $\mathcal{F}_t$-martingale if and
only if it is an $\mathcal{F}_{\tau \wedge t}$-martingale.

(a) Show that if $\mathcal{F}_t$ satisfies the usual conditions, then so does the filtration $\mathcal{F}_{\tau \wedge t}$.

(b) Prove the trivial fact that any $\mathcal{F}_t$-martingale adapted to $\mathcal{F}_{\tau \wedge t}$ is an $\mathcal{F}_{\tau \wedge t}$-martingale.

(c) Suppose that $X_t$ is càdlàg and adapted to a filtration $\mathcal{F}_t$ that satisfies the usual conditions.
Let $\tau$ be a stopping time such that $X^\tau - X_0$ is an $\mathcal{F}_{\tau \wedge t}$-martingale. Show that $X^\tau - X_0$
is an $\mathcal{F}_t$-martingale.

[Hints: (a) Theorem 2.7.3 (c) Let $F \in \mathcal{F}_s$. Note that $F \cap \{s \le \tau\} \in \mathcal{F}_{\tau \wedge s}$. Now note that

$$\mathbb{E}[X^\tau_t - X_0; F] = \mathbb{E}[X^\tau_t - X_0; F \cap \{s \le \tau\}] + \mathbb{E}[X^\tau_t - X_0; F \cap \{\tau < s\}].]$$

□ 



It must be stressed that, though every martingale is a local martingale, the converse is not 
true. We shall see some examples once we've covered stochastic integration. Nevertheless, we 
sometimes will want to know when a local martingale is actually a martingale. The following 
will aid in this regard: 



Proposition 2.7.30 Let $p \in [1, \infty)$ and let $X$ be a local $\mathcal{L}^p$-martingale, reduced by $\tau_n$. If
$\{|X^{\tau_n}_t|^p : n \in \mathbb{N}\}$ is UI for each $t \ge 0$, then $X$ is an $\mathcal{L}^p$-martingale.

Proof: Note that $|X_0| = |X_{0 \wedge \tau_n}|$ for each $n$. Thus $X_0 \in \mathcal{L}^p$ (by the UI assumption with
$t = 0$). It follows that each $X^{\tau_n}$ is an $\mathcal{L}^p$-martingale, and thus we have $X^{\tau_n}_t \to X_t$ a.s. as
$n \to \infty$. Moreover, the UI assumption implies that convergence is in $\mathcal{L}^p$ as well (Theorem
2.4.5). It follows that $\mathbb{E}[X^{\tau_n}_t \mid \mathcal{F}_s] \to \mathbb{E}[X_t \mid \mathcal{F}_s]$ in $\mathcal{L}^p$. Putting the pieces together yields the
result.



Corollary 2.7.31 If $X_t$ is a local martingale and if $\mathbb{E}[\sup_{0 \le s \le t} |X_s|] < \infty$ for each $t \ge 0$,
then $X_t$ is a martingale.



Exercise 2.7.32 Prove the preceding corollary. 

□ 



Proposition 2.7.33 If $X_t$ is a local martingale, and if $\tau$ is a stopping time, then $X^\tau$ is
a local martingale, and any sequence that reduces $X_t$ will also reduce $X^\tau$.

Proof: Suppose that $\sigma_n \uparrow \infty$ reduces $X_t$. Then $X^{\sigma_n} - X_0$ is a martingale, and so $X^{\tau \wedge \sigma_n} - X_0 =
(X^{\sigma_n} - X_0)^\tau$ is a martingale (Stopped martingales are martingales). The result follows easily.

⊣

The following proposition provides a nice class of stopping times which form reducing 
sequences for continuous local martingales: 



Proposition 2.7.34 Suppose that $X_t$ is a continuous local martingale. Let $\tau_n = \inf\{t :
|X_t| > n\}$. Then $\tau_n$ reduces $X$. Indeed, let $\sigma_n \le \tau_n$ be a sequence of stopping times with
$\sigma_n \uparrow \infty$. Then $\sigma_n$ reduces $X$.

Proof: That $\tau_n$ are stopping times we leave to the next exercise. Since stopped local martingales
are local martingales, each $X^{\sigma_n}$ is a local martingale. Moreover, since $X_t$ is continuous,
we have

$$\mathbb{E} \sup_{0 \le s \le t} |X^{\sigma_n}_s| \le n < \infty \quad \text{for all } t$$

which implies that $X^{\sigma_n}$ is a martingale (by Corollary 2.7.31), and thus that $\sigma_n$ reduces $X$.

⊣

Exercise 2.7.35 Prove that if $X_t$ is continuous, then $\tau_n = \inf\{t : |X_t| > n\}$ is a stopping
time.

[Hint: By continuity, $\{\tau_n \ge t\} = \bigcap_{q < t,\ q \in \mathbb{Q}} \{|X_q| \le n\}$]

□ 



Chapter 3 

Prelude to Stochastic Integration 



3.1 The Riemann-Stieltjes Integral: Motivation and Definition

Example 3.1.1 It is well-known that the area under a continuous curve $g(t)$ over the interval
$[a, b]$ is given by the Riemann integral $\int_a^b g(t)\,dt$. In elementary calculus courses, this integral
is defined as a limit of sums. Roughly, one partitions the interval $[a, b]$ into equally spaced
points $a = t_0 < t_1 < t_2 < \cdots < t_n = b$, and chooses an element $t_k^* \in [t_{k-1}, t_k]$ for each
$k = 1, \dots, n$. One then considers the Riemann sums

$$\sum_{k=1}^n g(t_k^*)(t_k - t_{k-1}) = \sum_{k=1}^n g(t_k^*)\,\Delta t$$

The Riemann integral $\int_a^b g(t)\,dt$ is defined to be the limit of these Riemann sums as $\Delta t \to 0$,
assuming this limit exists.

This description of the Riemann integral is not completely precise; you will learn how
to make it precise shortly.

□ 

Example 3.1.2 Suppose that $X$ is a random variable with values in the interval $[a, b]$, and
with distribution function $F : \mathbb{R} \to [0, 1] : x \mapsto \mathbb{P}(X \le x)$. Let $g : \mathbb{R} \to \mathbb{R}$ be an arbitrary
continuous function. To estimate $\mathbb{E}[g(X)]$, one can proceed as follows: Partition the interval
$[a, b]$ into subintervals with endpoints $a = x_0 < x_1 < \cdots < x_n = b$ and choose $x_k^* \in [x_{k-1}, x_k]$
for $k = 1, \dots, n$. If $\Delta_k x := x_k - x_{k-1}$ is sufficiently small, then the function $g$, being continuous,
is almost constant on the interval $[x_{k-1}, x_k]$, and has value $\approx g(x_k^*)$ on that interval. We
can now approximate $X$ by a discrete random variable $\hat{X}$ with values in $\{x_1^*, \dots, x_n^*\}$ defined
by

$$\hat{X} := x_k^* \iff X \in (x_{k-1}, x_k]$$

So we have

$$\mathbb{E}[g(X)] \approx \mathbb{E}[g(\hat{X})] = \sum_{k=1}^n g(x_k^*)\,\mathbb{P}(\hat{X} = x_k^*)$$

But $\hat{X} = x_k^*$ just when $X \in (x_{k-1}, x_k]$, i.e.

$$\mathbb{P}(\hat{X} = x_k^*) = \mathbb{P}(x_{k-1} < X \le x_k) = \mathbb{P}(X \le x_k) - \mathbb{P}(X \le x_{k-1}) = F(x_k) - F(x_{k-1})$$






Thus

$$\mathbb{E}[g(X)] \approx \sum_{k=1}^n g(x_k^*)\,(F(x_k) - F(x_{k-1})) = \sum_k g(x_k^*)\,\Delta_k F$$

The limit of these sums, as $\Delta x \to 0$, is a Riemann-Stieltjes integral, and is denoted $\int_a^b g(x)\,dF(x)$.

□ 
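
The approximation in Example 3.1.2 can be tried numerically. The sketch below is only an illustration under assumed inputs: the distribution function $F(x) = x^2$ on $[0, 1]$ (density $2x$), the choice $g(x) = x$, the midpoint tags and the function name `rs_expectation` are all hypothetical choices rather than part of the text. For this choice $\mathbb{E}[g(X)] = \int_0^1 x \cdot 2x\,dx = 2/3$.

```python
def rs_expectation(g, F, a, b, n):
    """Riemann-Stieltjes sum  sum_k g(x_k*) (F(x_k) - F(x_{k-1}))  on a uniform
    partition of [a, b] with n subintervals and midpoint tags x_k*."""
    h = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        left, right = a + (k - 1) * h, a + k * h
        tag = 0.5 * (left + right)            # x_k* in [x_{k-1}, x_k]
        total += g(tag) * (F(right) - F(left))
    return total


# Assumed example: F(x) = x^2 on [0, 1] and g(x) = x, so E[g(X)] = 2/3.
approx = rs_expectation(lambda x: x, lambda x: x * x, 0.0, 1.0, 1000)
exact = 2.0 / 3.0
```

Only the increments of $F$ enter the sum; no density is needed, which is the point of writing the expectation as $\int g\,dF$.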

Example 3.1.3 Suppose that at time $t$, you own $\theta(t)$-many shares, and that the share price
at time $t$ is $S(t)$. What is your gain/loss over a time period $[0, T]$?

At time $t$, you have $\theta(t)$ shares. Between times $t$ and $t + \Delta t$, the share price changes by
$\Delta S(t) := S(t + \Delta t) - S(t)$, and so your gain over that period is $\approx \theta(t)\,\Delta S(t)$. To approximate
your total gain over the period $[0, T]$, partition the interval $0 = t_0 < t_1 < \cdots < t_n = T$ to
obtain

$$\text{Gain} \approx \sum_{k=1}^n \theta(t_k)\,\Delta S(t_k)$$

Intuitively, the approximation becomes more and more accurate as $\Delta t \to 0$, so the gain of
the portfolio $\theta$ is, roughly, the limit $\int_0^T \theta(t)\,dS(t) = \lim_{\Delta t \to 0} \sum_{k=1}^n \theta(t_k)\,\Delta S(t_k)$, provided this
limit exists.¹

□ 
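
The discrete gain sum of Example 3.1.3 is easy to compute for a given price path. The following sketch uses a made-up price list and holdings (all numbers and the name `trading_gain` are hypothetical); the holding for each period is fixed at the start of the period and multiplied by the price change over that period.

```python
def trading_gain(theta, prices):
    """Sum of theta[k] * (prices[k+1] - prices[k]): the holding chosen at the
    start of each period times the price change over that period."""
    return sum(th * (s_next - s_now)
               for th, s_now, s_next in zip(theta, prices, prices[1:]))


prices = [100.0, 102.0, 101.0, 105.0]   # hypothetical S(t_0), ..., S(t_3)

# Holding one share throughout recovers the total price change S(T) - S(0).
buy_and_hold = trading_gain([1.0, 1.0, 1.0], prices)

# A strategy that holds 2 shares, then none, then 1 share.
switching = trading_gain([2.0, 0.0, 1.0], prices)
```

With constant holdings the sum telescopes, which is the discrete counterpart of $\int_0^T 1\,dS = S(T) - S(0)$.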

We have now seen several situations where we need to determine the limit of sums
$\sum_{k=1}^n f(t_k)\,\Delta G(t_k)$, where the limit is taken as $\Delta t \to 0$. The description so far has
been intuitive and imprecise. It is now time to introduce some rigour.

Let $f, G$ be real-valued functions defined and bounded on an interval $[a, b]$. A partition $P$
of $[a, b]$ is a finite ordered set $\{a = t_0 < t_1 < t_2 < \cdots < t_n = b\}$. The size of such a partition
is denoted $\sigma(P)$, and defined by

$$\sigma(P) := \max_k (t_k - t_{k-1})$$

A tagged partition is a partition $P$ together with a choice $t_k^* \in [t_{k-1}, t_k]$ for each $k = 1, \dots, n$.
Tagged partitions will be indicated by a $*$, i.e. if $P$ is a partition, then $P^*$ denotes an associated
tagged partition.

With each tagged partition, we can associate a Riemann-Stieltjes sum (abbreviated RS
sum)

$$S(P^*, f, G) := \sum_{k=1}^n f(t_k^*)\,(G(t_k) - G(t_{k-1})) = \sum_{k=1}^n f(t_k^*)\,\Delta_k G$$

The Riemann-Stieltjes integral (abbreviated RS integral) $\int_a^b f\,dG$ should be the limit of the
RS sums, over all tagged partitions $P^*$, as $\sigma(P) \to 0$. To be precise, we say

$$\lim_{\sigma(P) \to 0} S(P^*, f, G) = L \quad \text{exists}$$

if and only if for every $\varepsilon > 0$ there is $\delta > 0$ such that

$$|S(P^*, f, G) - L| < \varepsilon \quad \text{whenever } \sigma(P) < \delta$$

Then we define

$$\int_a^b f\,dG := \lim_{\sigma(P) \to 0} S(P^*, f, G)$$

provided this limit exists, and say that $f$ is Riemann-Stieltjes integrable with respect to $G$
on $[a, b]$.

When the function $G$ is the identity function $G(t) = t$, the Riemann-Stieltjes integral is
just the ordinary Riemann integral.

¹ Unfortunately, for most reasonable models of asset dynamics $S(t)$, this limit does not exist; a stochastic
integral must be used.

With each partition $\{a = t_0 < t_1 < \cdots < t_n = b\}$ it is possible to associate three natural
tagged partitions, namely those having tags equal to the left endpoint, right endpoint and
midpoint of each interval. This yields:

• The lefthand RS sum $\sum_k f(t_{k-1})\,\Delta_k G$;

• The righthand RS sum $\sum_k f(t_k)\,\Delta_k G$;

• The symmetric RS sum $\sum_k f\left(\frac{t_{k-1} + t_k}{2}\right)\Delta_k G$.

If $f$ is RS integrable w.r.t. $G$, then each of these sums must converge as $\sigma(P) \to 0$, and all
to the same limit.
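
The three sums above can be compared numerically. This is a minimal sketch on uniform partitions; the test functions $f(t) = t$ and $G(t) = t^2$ on $[0, 1]$, and the helper name `rs_sum`, are assumptions made for illustration. Here $G$ is differentiable, so $\int_0^1 f\,dG = \int_0^1 t \cdot 2t\,dt = 2/3$.

```python
def rs_sum(f, G, a, b, n, tag="left"):
    """Riemann-Stieltjes sum of f against G on a uniform partition of [a, b]
    with n subintervals; `tag` selects left, right or midpoint tags."""
    h = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        t_prev, t_k = a + (k - 1) * h, a + k * h
        if tag == "left":
            t_star = t_prev
        elif tag == "right":
            t_star = t_k
        elif tag == "mid":
            t_star = 0.5 * (t_prev + t_k)
        else:
            raise ValueError("tag must be 'left', 'right' or 'mid'")
        total += f(t_star) * (G(t_k) - G(t_prev))
    return total


# Assumed example: f(t) = t and G(t) = t^2 on [0, 1]; the RS integral is 2/3.
sums = {tag: rs_sum(lambda t: t, lambda t: t * t, 0.0, 1.0, 2000, tag)
        for tag in ("left", "right", "mid")}
```

For this pair the left sums approach the limit from below and the right sums from above, while the midpoint sums are noticeably more accurate at the same mesh.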

Remarks 3.1.4 A slightly different definition uses Darboux sums rather than RS sums. Given
real-valued functions $f, G$ defined and bounded on an interval $[a, b]$, and a partition $P = \{a = t_0 <
t_1 < \cdots < t_n = b\}$, let the upper and lower Darboux sums be defined by

$$U(P, f, G) := \sum_{k=1}^n \sup\{f(t) : t \in [t_{k-1}, t_k]\} \cdot (G(t_k) - G(t_{k-1}))$$

$$L(P, f, G) := \sum_{k=1}^n \inf\{f(t) : t \in [t_{k-1}, t_k]\} \cdot (G(t_k) - G(t_{k-1}))$$

If $f$ is continuous on $[a, b]$, it attains its supremum and infimum on each subinterval, i.e. we can choose
$t_k^{\max}, t_k^{\min} \in [t_{k-1}, t_k]$ such that

$$f(t_k^{\max}) = \sup\{f(t) : t \in [t_{k-1}, t_k]\} \qquad f(t_k^{\min}) = \inf\{f(t) : t \in [t_{k-1}, t_k]\}$$

If $P^{*\max}, P^{*\min}$ are the tagged partitions given by $a = t_0 < \cdots < t_n = b$ and the tags $t_k^{\max}, t_k^{\min}$
respectively, then it is easy to see that

$$U(P, f, G) = S(P^{*\max}, f, G) \qquad L(P, f, G) = S(P^{*\min}, f, G)$$

i.e. the Darboux sums give the most extreme values of the Riemann-Stieltjes sums for any given
partition. However, the Darboux sums may differ from Riemann-Stieltjes sums if $f$ is not continuous.

The Riemann-Stieltjes sums may be defined even when $f, G$ are Banach space valued, however,
whereas the Darboux sums, being dependent on sup's and inf's, make sense for real-valued functions
only.

□ 

Now that we have defined $\int_a^b f\,dG$ as a certain limit, the first question that begs our
attention is the following: Under what conditions does this limit exist?
Here are a few instances where the answer is obvious:






Example 3.1.5 (1) If $G = \mathrm{id}$ is the identity function on $[a, b]$ (i.e. $G(t) = t$ for all $t$),
then $\int_a^b f\,dG$ is just the ordinary Riemann integral of $f$ with respect to $t$, and will exist
whenever $f$ is, for example, piecewise continuous on $[a, b]$.

(2) If $f(t) = 1$ for all $t \in [a, b]$, then

$$S(P^*, f, G) = \sum_k (G(t_k) - G(t_{k-1})) = G(b) - G(a)$$

for all tagged partitions $P^*$, and hence $\int_a^b 1\,dG = G(b) - G(a)$ for any function $G$.

(3) If $G$ is differentiable, with $G'(t) = g(t)$, and $\{a = t_0 < t_1 < \cdots < t_n = b\}$ is a partition, there is by the
Mean Value Theorem a $t_k^* \in [t_{k-1}, t_k]$ such that

$$G(t_k) - G(t_{k-1}) = g(t_k^*)(t_k - t_{k-1})$$

Thus

$$S(P^*, f, G) = \sum_k f(t_k^*)\,g(t_k^*)\,(t_k - t_{k-1}) = S(P^*, fg, \mathrm{id})$$

and so

$$\int_a^b f\,dG = \int_a^b f(t)g(t)\,dt = \int_a^b f(t)G'(t)\,dt$$

is again just an ordinary Riemann integral. It will exist if, for example, $f, G'$ are both
piecewise continuous on $[a, b]$.

□ 

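Example 3.1.6 Here is a simple example which shows when the Riemann-Stieltjes integral
may fail to exist. Define $f : [0, 1] \to \mathbb{R}$ by

$$f(t) = \begin{cases} 0 & \text{if } t < \frac{1}{2} \\ 1 & \text{if } t \ge \frac{1}{2} \end{cases}$$

Consider partitions $\{0 = t_0 < t_1 < \cdots < t_n = 1\}$ of $[0, 1]$ with the property that the point
$\frac{1}{2}$ is not one of the endpoints $t_k$. For each such partition, there is a unique $k_0$ such that
$t_{k_0 - 1} < \frac{1}{2} < t_{k_0}$. Then

$$\Delta_k f = \begin{cases} 0 & \text{if } k \ne k_0 \\ 1 & \text{if } k = k_0 \end{cases}$$

Looking at the left- and righthand RS sums, we see that

$$\sum_k f(t_{k-1})\,\Delta_k f = 0 \qquad \sum_k f(t_k)\,\Delta_k f = 1$$

for all such partitions, no matter how fine. Hence $\int_0^1 f\,df$ does not exist.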

□ 
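
The failure in Example 3.1.6 is easy to observe numerically. The sketch below assumes the jump sits at $\frac{1}{2}$ with $f(\frac{1}{2}) = 1$, and uses uniform partitions with an odd number of subintervals so that $\frac{1}{2}$ is never a partition point; the helper names are illustrative.

```python
def step(t):
    """Step function with a single jump at 1/2 (taking the value 1 at the jump)."""
    return 0.0 if t < 0.5 else 1.0


def left_right_sums(f, G, points):
    """Lefthand and righthand RS sums of f against G on the given partition."""
    pairs = list(zip(points, points[1:]))
    left = sum(f(s) * (G(t) - G(s)) for s, t in pairs)
    right = sum(f(t) * (G(t) - G(s)) for s, t in pairs)
    return left, right


results = []
for n in (3, 31, 301, 3001):            # odd n: 1/2 is never an endpoint k/n
    points = [k / n for k in range(n + 1)]
    results.append(left_right_sums(step, step, points))
```

However fine the partition, the lefthand sum stays at 0 and the righthand sum at 1, so the RS sums have no common limit.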

This argument can easily be adapted to prove that $\int_a^b f\,dG$ cannot exist if $f, G$ share a
common point of discontinuity in $[a, b]$.



3.2 Functions of Finite Variation 

Suppose that $f, G$ are bounded on $[a, b]$. Let $M$ be a bound for $f$, so that $|f(t)| \le M$ for all
$t \in [a, b]$. Then

$$|S(P^*, f, G)| \le \sum_{k=1}^n |f(t_k^*)|\,|\Delta_k G| \le M \sum_{k=1}^n |G(t_k) - G(t_{k-1})|$$

For $\int_a^b f\,dG$ to exist for a variety of functions $f$, it is therefore necessary that the quantity

$$\lim_{\sigma(P) \to 0} \sum_{k=1}^n |G(t_k) - G(t_{k-1})|$$

does not get out of hand, i.e. does not diverge to $+\infty$ as partition meshes get smaller and
smaller. This motivates the following definition:

Definition 3.2.1 Let $G : [a, b] \to \mathbb{R}$, and let $P = \{a = t_0 < t_1 < \cdots < t_n = b\}$ be a partition of
$[a, b]$. Define the variation of $G$ on $P$ by

$$V_P(G; [a, b]) = \sum_{k=1}^n |G(t_k) - G(t_{k-1})|$$

Define the (total) variation of $G$ on $[a, b]$ by

$$V(G; [a, b]) = \sup_P V_P(G; [a, b])$$

where the supremum is taken over all partitions of $[a, b]$.

The function $G$ is said to be of finite variation on $[a, b]$ provided that $V(G; [a, b]) < \infty$.

□

A function is said to be of locally finite variation if its variation on every compact interval is finite.

If $P, Q$ are partitions of $[a, b]$, we say that $Q$ is finer than $P$ iff $P \subseteq Q$. This simply means
that every subdivision of $P$ is a union of subdivisions of $Q$. Using the triangle inequality, it
is easy to see that if $Q$ is finer than $P$, then $V_P(G) \le V_Q(G)$, and thus that $V_P(G)$ increases
as $P$ gets finer and finer. Thus $\lim_{\sigma(P) \to 0} V_P(G) = \sup_P V_P(G) = V(G)$.

The quantity $V(G; [a, b])$ is very important: If $V(G; [a, b]) = \infty$, we may well expect that
$\int_a^b f\,dG$ does not exist for a variety of functions $f$.

The following properties are easy to prove: 

Proposition 3.2.2 (a) If $f$ is increasing on $[a, b]$, then $V(f; [a, b]) = f(b) - f(a)$. Thus any
monotone function is of locally finite variation.

(b) If $s \le t$, then $V(f; [a, s]) \le V(f; [a, t])$.

(c) If $f$ is differentiable, then $V(f; [a, b]) = \int_a^b |f'(s)|\,ds$. Thus any continuously differentiable
function is of locally finite variation.

(d) $V(f; [a, b]) \ge |f(b) - f(a)|$.

(e) If $\alpha \in \mathbb{R}$ and if $f, g$ are of (locally) finite variation, then so are $f + g$ and $\alpha f$, and

$$V(f + g; [a, t]) \le V(f; [a, t]) + V(g; [a, t]) \qquad V(\alpha f; [a, t]) = |\alpha|\,V(f; [a, t])$$

Hence the family of functions of bounded variation on a compact interval is a vector space.

(f) If $a \le b \le c$, then $V(f; [a, c]) = V(f; [a, b]) + V(f; [b, c])$.

□ 

Exercise 3.2.3 (a) Prove the preceding proposition. 

(b) Calculate the variation of $\sin(x)$ over $[0, 2\pi]$ directly from the definition. Verify your
answer by using (c) of the preceding proposition.

(c) Find all functions / on [a, b] with the property that V(f; [a, b]) = 0. 

□ 

Can you think of any bounded functions which are not of finite variation on a compact 
interval? At first glance, probably not. If you understood the previous exercise, you will 
know that the variation is essentially the sum of the sizes of all the "bounces" , i.e. the sum 
of the distances from each local minimum to the next local maximum, and from each local 
maximum to the next local minimum. 

Example 3.2.4 Let $f(x) = x^2$. On $[-2, 3]$, $f$ first bounces down from 4 to 0, and then bounces
up from 0 to 9. The variation $V_f(-2, 3)$ is therefore equal to $4 + 9 = 13$.

□ 

This shows that a function is of bounded variation if it is not too bouncy, i.e. if the sum of 
all the bounces does not add up to +oo. 
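
The "sum of the bounces" description can be checked by evaluating the variation $V_P$ on a fine partition. This is a sketch on uniform partitions, with helper names chosen here for illustration; it reproduces the value 13 for $x^2$ on $[-2, 3]$ and comes close to the value 4 for $\sin$ on $[0, 2\pi]$ (up 1, down 2, up 1).

```python
import math


def variation_on_partition(f, points):
    """V_P(f) = sum of |f(t_k) - f(t_{k-1})| over consecutive partition points."""
    return sum(abs(f(t) - f(s)) for s, t in zip(points, points[1:]))


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]


# x^2 on [-2, 3]: the partition contains the turning point 0, so V_P is 4 + 9.
v_square = variation_on_partition(lambda x: x * x, uniform_partition(-2.0, 3.0, 500))

# sin on [0, 2*pi]: V_P approaches the total variation 4 as the mesh shrinks.
v_sine = variation_on_partition(math.sin, uniform_partition(0.0, 2 * math.pi, 4000))
```

For a piecewise monotone function, any partition that contains all the turning points already attains the supremum in the definition of the total variation.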



Exercise 3.2.5 Consider the functions

$$f(x) = \begin{cases} x \sin\left(\frac{1}{x}\right) & \text{for } x \ne 0 \\ 0 & \text{for } x = 0 \end{cases} \qquad g(x) = \begin{cases} x^2 \sin\left(\frac{1}{x}\right) & \text{for } x \ne 0 \\ 0 & \text{for } x = 0 \end{cases}$$

(a) Show that $f, g$ are continuous.

(b) Show that $f$ is not of bounded variation on $[-\pi, \pi]$.

(c) Sketch a graph of $f$ to see why it isn't of finite variation. Also sketch a graph of $g$.

(d) Show that $g$ is of finite variation on $[-\pi, \pi]$.



[Hints: (b) $f(x)$ reaches a local maximum for values $x = \frac{2}{(4n+1)\pi}$ and a local minimum for
values $x = \frac{2}{(4n+3)\pi}$. Thus

$$V(f; [-\pi, \pi]) \ge \sum_{n=0}^\infty \frac{2}{\pi}\left[\frac{1}{4n+1} + \frac{1}{4n+3} + \frac{1}{4n+3} + \frac{1}{4n+5}\right] \ge \frac{2}{4\pi} \sum_{n=1}^\infty \frac{1}{n} = +\infty$$

because the harmonic series diverges.
(d) $\sum_n \frac{1}{n^2}$ converges.]



□ 
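
The contrast between $f$ and $g$ in Exercise 3.2.5 can be seen numerically. The sketch below evaluates the variation along a partition of $[0, 2/\pi]$ through the points $x_n = \frac{2}{(2n+1)\pi}$, where $\sin(1/x) = \pm 1$; the partition and the function names are choices made for this illustration. Since $[0, 2/\pi] \subseteq [-\pi, \pi]$, these sums are lower bounds for the variation on $[-\pi, \pi]$.

```python
import math


def f(x):
    return x * math.sin(1.0 / x) if x != 0 else 0.0


def g(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0


def variation_through_peaks(func, n_peaks):
    """Variation of func along the partition 0 < x_N < ... < x_1 < x_0 of
    [0, 2/pi], where x_n = 2 / ((2n + 1) pi) and N = n_peaks."""
    points = [0.0] + [2.0 / ((2 * n + 1) * math.pi) for n in range(n_peaks, -1, -1)]
    return sum(abs(func(t) - func(s)) for s, t in zip(points, points[1:]))


vf_small, vf_large = variation_through_peaks(f, 100), variation_through_peaks(f, 10000)
vg_small, vg_large = variation_through_peaks(g, 100), variation_through_peaks(g, 10000)
```

Adding more oscillations keeps increasing the sum for $f$ (roughly logarithmically in the number of peaks), whereas for $g$ the extra contribution is negligible, in line with the divergence of $\sum 1/n$ and the convergence of $\sum 1/n^2$.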



Suppose that $f$ is of finite variation on a compact interval $[a, b]$. For $s, t \in [a, b]$ define
$V_f(s, t) := V(f; [s, t])$.

Proposition 3.2.6 If $f$ is continuous and of finite variation on a compact interval $[a, b]$ then
$V_f(a, t)$ is continuous in $t$.

Proof: Since $V_f(a, t)$ is increasing in $t$, we just need to rule out jumps. We first show that
$V_f(a, t)$ is left-continuous at $t \in (a, b]$, i.e. that $V_f(a, s) \uparrow V_f(a, t)$ as $s \uparrow t$. Since

$$V_f(a, t) = V_f(a, s) + V_f(s, t)$$

we need only show that $V_f(s, t) \to 0$ as $s \uparrow t$. Now if this is not the case, then there is
$\delta > 0$ such that $V_f(s, t) > \delta$ for all $s < t$. Let $s_1 < t$ be arbitrary. Choose a partition
$P_1 = \{s_1 = t^1_0 < t^1_1 < \cdots < t^1_n = t\}$ of $[s_1, t]$ such that

$$\sum_{k=1}^n |f(t^1_k) - f(t^1_{k-1})| > \delta$$

Since $f$ is continuous at $t$ we may choose $s_2$ sufficiently close to $t$ such that the inequality
still holds if we replace the last term $|f(t) - f(t^1_{n-1})|$ in the above sum by $|f(s_2) - f(t^1_{n-1})|$.
This proves that there is $s_2$ such that $V_f(s_1, s_2) > \delta$, which shows that

$$V_f(a, t) \ge V_f(s_1, t) = V_f(s_1, s_2) + V_f(s_2, t) > 2\delta$$

Now repeat the entire argument, replacing $s_1$ by $s_2$, to find an $s_3$ very close to $t$ so that
$V_f(s_2, s_3) > \delta$. We then have

$$V_f(a, t) \ge V_f(s_1, t) = V_f(s_1, s_2) + V_f(s_2, s_3) + V_f(s_3, t) > 3\delta$$

Next find $s_4$ sufficiently close to $t$ such that $V_f(s_3, s_4) > \delta$, to prove that $V_f(a, t) > 4\delta$, etc.
It follows that $V_f(a, t) = \infty$, contradiction. Hence $V_f(a, t)$ is left-continuous (in $t$).
A similar argument will show that $V_f(a, t)$ is right-continuous.

⊣






The following result is tremendously useful: 

Proposition 3.2.7 A function $f$ is of finite variation over a compact interval if and only if
$f$ can be represented as the difference $f = g - h$ of two increasing functions $g, h$. Moreover,
if $f$ is continuous, then we can choose $g, h$ to be continuous as well.

Proof: It is easy to see that if $g, h$ are increasing, then $g - h$ is of locally finite variation.
(Exercise!) Now assume that $f$ is of finite variation on $[a, b]$. Define

$$g(t) = \tfrac{1}{2}\,[V_f(a, t) + f(t)] \qquad h(t) = \tfrac{1}{2}\,[V_f(a, t) - f(t)]$$

It is clear that $g - h = f$, and that if $f$ is continuous, then so are $g, h$ (because $V_f(a, t)$ is
continuous in $t$). So it remains to show that $g, h$ are increasing. But if $s \le t$, then

$$g(t) - g(s) = \tfrac{1}{2}\,[V_f(a, t) - V_f(a, s) + f(t) - f(s)]$$
$$= \tfrac{1}{2}\,[V_f(s, t) + f(t) - f(s)]$$
$$\ge 0$$

because $V_f(s, t) \ge |f(t) - f(s)|$. Thus $g(t) \ge g(s)$, and so $g$ is increasing. A similar argument
holds for $h$.

⊣
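
The construction in the proof can be imitated on a grid. The sketch below replaces $V_f(a, t)$ by the variation accumulated along a fixed partition, which is only a discrete stand-in for the true variation function; the function name `jordan_decomposition` and the test function $\sin$ on $[0, 2\pi]$ are illustrative assumptions.

```python
import math


def jordan_decomposition(f, points):
    """Grid version of g = (V + f)/2 and h = (V - f)/2, where V is the
    variation of f accumulated along `points` (a stand-in for V_f(a, t))."""
    values = [f(t) for t in points]
    cum_var = [0.0]
    for prev, cur in zip(values, values[1:]):
        cum_var.append(cum_var[-1] + abs(cur - prev))
    g = [0.5 * (v + y) for v, y in zip(cum_var, values)]
    h = [0.5 * (v - y) for v, y in zip(cum_var, values)]
    return g, h, values


points = [2 * math.pi * k / 1000 for k in range(1001)]
g, h, values = jordan_decomposition(math.sin, points)
```

On the grid, each increment of `g` equals $\frac{1}{2}(|\Delta f| + \Delta f) \ge 0$ and each increment of `h` equals $\frac{1}{2}(|\Delta f| - \Delta f) \ge 0$, mirroring the inequality $V_f(s, t) \ge |f(t) - f(s)|$ used in the proof.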

Exercise 3.2.8 Show that if $f(x)$ is of finite variation, it is discontinuous at at most countably
many $x$.

[Hint: Since $f$ is the difference of two increasing functions, it suffices to show that an increasing
function can have at most countably many points of discontinuity. Define

$$A_n = \{x : f \text{ is discontinuous at } x \text{ with jump} \ge \tfrac{1}{n}\}$$

If $f$ has uncountably many jumps, then one of the $A_n$ is uncountable.]

□ 

We now prove a criterion which guarantees the existence of $\int_a^b f\,dG$:

Theorem 3.2.9 If $f$ is continuous and $G$ is of bounded variation over $[a, b]$, then $\int_a^b f\,dG$
exists.

Proof: Since $f$ is continuous on the compact interval $[a, b]$, it is uniformly continuous. Thus,
given $\varepsilon > 0$, we may choose $\delta > 0$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon$.
Suppose that

$$P = \{a = t_0 < t_1 < \cdots < t_n = b\}$$

is a partition of $[a, b]$ with $\sigma(P) < \delta$, and choose $u_k, l_k \in [t_{k-1}, t_k]$ such that

$$f(u_k) = \sup\{f(t) : t \in [t_{k-1}, t_k]\} \qquad f(l_k) = \inf\{f(t) : t \in [t_{k-1}, t_k]\}$$

$u_k, l_k$ exist because a continuous function has a maximum and a minimum on any compact
set. Let $P^{*\max}, P^{*\min}$ be the associated tagged partitions, with tags $u_k, l_k$ respectively. If $P^*$
is any other tagged partition based on $P$, we clearly have

$$S(P^{*\min}, f, G) \le S(P^*, f, G) \le S(P^{*\max}, f, G)$$

so to prove that $f$ is RS-integrable w.r.t. $G$ it suffices to show that if $\Delta(P, f, G) :=
S(P^{*\max}, f, G) - S(P^{*\min}, f, G)$, then $\lim_{\sigma(P) \to 0} \Delta(P, f, G) = 0$.
Note that

$$S(P^{*\max}, f, G) = \sum_k f(u_k)\,\Delta_k G \qquad S(P^{*\min}, f, G) = \sum_k f(l_k)\,\Delta_k G$$

and thus

$$\Delta(P, f, G) = \sum_k [f(u_k) - f(l_k)]\,\Delta_k G$$

Hence

$$|\Delta(P, f, G)| \le \varepsilon \sum_k |\Delta_k G| \le \varepsilon\,V(G; [a, b])$$

Since $V(G; [a, b])$ is finite, by assumption, it follows that $\Delta(P, f, G)$ can be made arbitrarily
small by choosing $P$ to be sufficiently fine.

⊣
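
The key estimate $|\Delta(P, f, G)| \le \varepsilon\,V(G; [a, b])$ can be observed in a simple case. The sketch below assumes $f = \cos$ and $G(t) = t^2$ on $[0, 1]$ (both choices, and the function name, are illustrative). Since $\cos$ is decreasing there, its supremum and infimum on each subinterval sit at the endpoints, and since $|\cos'| \le 1$ the oscillation of $f$ on a subinterval of length $1/n$ is at most $1/n$.

```python
import math


def extreme_sum_gap(n):
    """Gap S(P*max) - S(P*min) for f = cos and G(t) = t^2 on [0, 1], using a
    uniform partition with n subintervals. cos is decreasing on [0, 1], so its
    sup and inf on [t_{k-1}, t_k] are attained at the left and right endpoints."""
    gap = 0.0
    for k in range(1, n + 1):
        t_prev, t_k = (k - 1) / n, k / n
        oscillation = math.cos(t_prev) - math.cos(t_k)
        gap += oscillation * (t_k ** 2 - t_prev ** 2)
    return gap


total_variation_G = 1.0   # G(t) = t^2 is increasing on [0, 1]: V(G) = G(1) - G(0)
gaps = {n: extreme_sum_gap(n) for n in (10, 100, 1000)}
```

Each gap is bounded by (oscillation bound) times $V(G)$, i.e. by $1/n$ here, so the extreme RS sums are squeezed together as the mesh shrinks.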

More can be said: If $f$ is merely piecewise continuous on $[a, b]$, and if $G$ is of finite variation
such that $f, G$ have no common point of discontinuity, then $\int_a^b f\,dG$ exists.

The converse of Theorem 3.2.9 is not true, however: $\int_a^b f\,dG$ may exist even if $G$ is not
of finite variation. The following proposition shows why: If $f$ is continuous, and of bounded
variation, and if $G$ is continuous, but possibly of infinite variation, then $\int_a^b f\,dG$ exists,
because $\int_a^b G\,df$ exists (by Theorem 3.2.9).

Proposition 3.2.10 (Integration by Parts)

If $f$ is integrable with respect to $g$, then $g$ is integrable with respect to $f$, and

$$\int_a^b g\,df = f(b)g(b) - f(a)g(a) - \int_a^b f\,dg$$

Note that if $f, g$ are differentiable, then this is just the ordinary integration-by-parts formula
of first-year calculus.

Proof: Let $P = \{a = t_0 < t_1 < \cdots < t_n = b\}$ be a partition of $[a, b]$, and let $t_k^* \in [t_{k-1}, t_k]$
for $k = 1, \dots, n$. Put $t_0^* := a$ and $t_{n+1}^* := b$. Note that $Q = \{a = t_0^* \le t_1^* \le \cdots \le t_n^* \le t_{n+1}^* = b\}$ is also a partition of
$[a, b]$, and that as $P$ gets finer, so does $Q$ (and vice versa). Also note that $t_k \in [t_k^*, t_{k+1}^*]$, for
$k = 0, \dots, n$, so that the points $t_k$ serve as tags for $Q$. Now

$$\sum_{k=1}^n f(t_k^*)\,[g(t_k) - g(t_{k-1})] = \sum_{k=0}^n g(t_k)\,[f(t_k^*) - f(t_{k+1}^*)] - f(a)g(a) + f(b)g(b)$$

$$= f(b)g(b) - f(a)g(a) - \sum_{k=0}^n g(t_k)\,[f(t_{k+1}^*) - f(t_k^*)]$$

so the result follows by taking limits as $\sigma(P) \to 0$.
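
The integration-by-parts formula can be checked numerically with RS sums. The sketch below assumes $f(t) = t^2$ and $g(t) = \sin t$ on $[0, 1]$ and uses midpoint-tagged sums on a uniform partition; these choices and the helper name are for illustration only. For this pair, $\int_0^1 g\,df = \int_0^1 2t \sin t\,dt = 2(\sin 1 - \cos 1)$.

```python
import math


def rs_sum_mid(integrand, integrator, a, b, n):
    """Midpoint-tagged RS sum of `integrand` against `integrator` on [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        t_prev, t_k = a + (k - 1) * h, a + k * h
        total += integrand(0.5 * (t_prev + t_k)) * (integrator(t_k) - integrator(t_prev))
    return total


def f(t):
    return t * t


g = math.sin
a, b, n = 0.0, 1.0, 2000

lhs = rs_sum_mid(g, f, a, b, n)                                   # approximates the integral of g df
rhs = f(b) * g(b) - f(a) * g(a) - rs_sum_mid(f, g, a, b, n)       # boundary terms minus the integral of f dg
closed_form = 2.0 * (math.sin(1.0) - math.cos(1.0))
```

Both sides agree with the closed form up to the discretisation error of the sums.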






Nevertheless, we do have the following partial converse to Theorem 3.2.9. It depends on
the Banach-Steinhaus Theorem,² a result in Banach space theory. Omit the proof if, as I
expect, you don't know this theorem.

Theorem 3.2.11 If $\int_a^b f\,dG$ exists for all continuous $f$ on $[a, b]$, then $G$ is of bounded variation
on $[a, b]$.

Proof: Let $C[a, b]$ be the Banach space of continuous functions $f : [a, b] \to \mathbb{R}$, equipped with the
supremum norm. Assume that $G : [a, b] \to \mathbb{R}$ has the property that $\int_a^b f\,dG$ exists for all $f \in C[a, b]$.
We shall prove that $G$ is of bounded variation. Let $P_n$ be a sequence of partitions of $[a, b]$ with
$\|P_n\| \to 0$.

If $P_n = \{a = t_0 < t_1 < \cdots < t_m = b\}$, define a linear operator $T_n : C[a, b] \to \mathbb{R}$ by

$$T_n(f) = \sum_{k=1}^m f(t_{k-1})\,\Delta_k G$$

(where $\mathbb{R}$ is a Banach space equipped with the usual (absolute value) norm). Let $f_n \in C[a, b]$ be such
that $f_n(t_{k-1}) = \operatorname{sign} \Delta_k G$, with $\|f_n\| = 1$. Then

$$T_n(f_n) = \sum_{k=1}^m |\Delta_k G|$$

and so

$$\|T_n\| \ge \sum_{k=1}^m |\Delta_k G|$$

It follows that

$$\sup_n \|T_n\| \ge V_G(a, b)$$

Now by assumption, $\lim_n T_n(f) = \int_a^b f\,dG$ exists for each $f \in C[a, b]$, and hence the set $\{T_n(f) :
n \in \mathbb{N}\}$ is bounded. By the Banach-Steinhaus Theorem, the set $\{\|T_n\| : n \in \mathbb{N}\}$ is also bounded, i.e.
$\sup_n \|T_n\| < \infty$. It follows that $V_G(a, b) < \infty$, i.e. that $G$ is of bounded variation.



The following result may be found in any good (advanced) text on Real Analysis. 

Theorem 3.2.12 A function locally of bounded variation is differentiable almost everywhere. 

□ 

The converse is false: The function $f(x) = x\sin(\frac{1}{x})$ (with $f(0) = 0$) is everywhere continuous, and differentiable almost everywhere (except at $x = 0$), yet not of bounded variation.
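To see why $x\sin(1/x)$ fails to be of bounded variation on $[0,1]$, one can sum $|f(x_{i+1}) - f(x_i)|$ over the points where $\sin(1/x) = \pm 1$. The sketch below does this; the particular partition points and the function name `variation_lower_bound` are illustrative choices, not taken from the notes.

```python
import math

def f(x):
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0

def variation_lower_bound(n):
    """Variation sum of f over the n+1 points 2/((2k+1)pi), k = n, ..., 0."""
    # At these points sin(1/x) alternates between +1 and -1.
    pts = [2.0 / ((2 * k + 1) * math.pi) for k in range(n, -1, -1)]
    return sum(abs(f(pts[i + 1]) - f(pts[i])) for i in range(len(pts) - 1))

for n in (10, 100, 10000):
    print(n, variation_lower_bound(n))
```

The sums behave like a harmonic series and so grow without bound (roughly logarithmically in `n`), which is the obstruction to bounded variation.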



^1 Also called the Principle of Uniform Boundedness.






3.3 The Lebesgue-Stieltjes Integral

It is not hard to generalize the Riemann-Stieltjes integral to a Lebesgue-Stieltjes integral. Recall that Lebesgue measure $\lambda$ is a measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ which assigns to each interval its length, e.g. $\lambda(a,b] = b - a$. Similarly, if $G$ is a right-continuous increasing function, we can define a measure $\mu_G$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ which assigns to each interval $(a,b]$ the measure $\mu_G(a,b] = G(b) - G(a)$. This is called the Lebesgue-Stieltjes measure of $G$. Of course, we've left out a lot of details: we have to check that $\mu_G$ is countably additive on the algebra of left half-open intervals, so that Carathéodory's Extension Theorem applies, etc. But you get the idea.

If $F$ is of bounded variation, then $F$ can be represented as a difference of two increasing functions $F = G - H$. We can then define the Lebesgue-Stieltjes integral of a function $u(t)$ with respect to $F(t)$ over $B \in \mathcal{B}(\mathbb{R})$ by

\[ \int_B u\,dF = \int_B u\,d\mu_G - \int_B u\,d\mu_H \]

3.4 Stieltjes Integration of Stochastic Processes 

We shall attempt to define the stochastic integral

\[ \int_0^T H_t(\omega)\,dX_t(\omega) \]

for adapted processes $H_t, X_t$. Of course, we assume that we work in a probability space equipped with a filtration $(\mathcal{F}_t)_t$, and that $H_t, X_t$ are adapted to this filtration. If $X_t$ is (almost surely) of bounded variation on $[0,T]$, and if $H_t$ is a.s. continuous, then it follows from the previous section that we can define the stochastic integral pathwise as a random variable whose values are Riemann-Stieltjes integrals, one for each $\omega \in \Omega$. Even if $H_t$ is not a.s. continuous, but bounded and adapted, we will be able to define the stochastic integral as a random variable whose values are Lebesgue-Stieltjes integrals. This means that we have now successfully defined $\int_0^T H_t\,dX_t$ in the case where $H_t$ is a.s. bounded and adapted, and $X_t$ is a Poisson process (see Remarks below), for example, because Poisson processes are increasing, and thus of bounded variation.

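Since $H_t(\omega)$ is Riemann-Stieltjes integrable with respect to $X_t(\omega)$ if and only if $X_t(\omega)$ is Riemann-Stieltjes integrable with respect to $H_t(\omega)$ ("Integration by Parts"), we have a valid definition of $\int_0^T H_t\,dX_t$ whenever $H_t, X_t$ are continuous, and one of them is of bounded variation.

For example, by using the integration by parts formula, we can interpret $\int_0^T \sin(t)\,dW_t(\omega)$ to be $\sin(T)W_T(\omega) - \int_0^T W_t(\omega)\,d\sin(t) = \sin(T)W_T(\omega) - \int_0^T W_t(\omega)\cos t\,dt$, where the last integral is an ordinary Riemann integral for each $\omega \in \Omega$. (The boundary term vanishes when $T$ is a multiple of $\pi$.)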
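As a sanity check of this pathwise interpretation, the sketch below compares a left-hand Riemann-Stieltjes sum for $\int_0^T \sin(t)\,dW_t$ with the integration-by-parts expression on one simulated path. The horizon $T = 2$, the grid size, the seed and the helper `brownian_path` are all illustrative assumptions, and a random walk on a grid only approximates Brownian motion.

```python
import math
import random

def brownian_path(n, T, rng):
    """Return W at times k*T/n, k = 0..n, with W_0 = 0 (Gaussian random walk)."""
    dt = T / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

rng = random.Random(0)          # fixed seed, arbitrary
T, n = 2.0, 10000
dt = T / n
W = brownian_path(n, T, rng)

# Left-hand Riemann-Stieltjes sum for int_0^T sin(t) dW_t
stieltjes = sum(math.sin(k * dt) * (W[k + 1] - W[k]) for k in range(n))

# sin(T) W_T - int_0^T W_t cos(t) dt, the last integral as an ordinary Riemann sum
riemann = sum(W[k + 1] * math.cos((k + 1) * dt) * dt for k in range(n))
by_parts = math.sin(T) * W[n] - riemann

print(stieltjes, by_parts)
```

The two values agree up to a discretisation error of order `dt`, because only the smooth function $\sin$ is being differentiated.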

Remarks 3.4.1 Recall that a random variable $Y$ is Poisson with parameter $a$ if it takes only non-negative integers as values, and if

\[ \mathbb{P}(Y = k) = \frac{a^k}{k!} e^{-a} \]

The mean and variance of $Y$ are both easily shown to be equal to $a$. For example, $Y$ can be interpreted as the number of times a certain event has occurred per unit time, where the average rate of occurrence is $a$.

$(X_t : t \ge 0)$ is a Poisson process (with rate $a$) provided that

(1) $X_0 = 0$

(2) For $0 \le s < t < \infty$, $X_t - X_s$ is a Poisson random variable with mean $a(t - s)$

(3) For $0 \le t_0 < t_1 < \dots < t_n < \infty$, the set of random variables $\{X_{t_k} - X_{t_{k-1}} : k = 1, \dots, n\}$ is independent.

One can think of $X_t$ as the number of times a certain event has occurred by time $t$, if they occur independently at an average rate of $a$ per unit time. Any Poisson process has a version with right-continuous paths, and these paths are almost surely constant, except for upward jumps of size 1. Only finitely (a.s.) many jumps will occur in any bounded time interval. Closely associated with a Poisson process is a family of random times $T_k$, where $T_k$ is the time between the $k$-th and $(k+1)$-th jump. These are exponentially distributed, with $\mathbb{P}(T_k \le t) = 1 - e^{-at}$.

It is easy to see that $M_t = X_t - at$ is a martingale of bounded variation, but $M_t$ is not continuous.

□ 
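The description via exponential waiting times gives a direct way to simulate a Poisson process. The following sketch (rate $a = 2$, horizon $t = 3$, sample size and the name `poisson_count` are illustrative assumptions) checks by Monte Carlo that $X_t$ has mean and variance close to $at$, so that the compensated variable $X_t - at$ has mean close to zero.

```python
import random

def poisson_count(rate, t, rng):
    """Number of jumps in [0, t] when waiting times are i.i.d. exponential(rate)."""
    count, clock = 0, rng.expovariate(rate)
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)
    return count

rng = random.Random(1)                 # fixed seed, arbitrary
a, t, n_paths = 2.0, 3.0, 4000
samples = [poisson_count(a, t, rng) for _ in range(n_paths)]

mean = sum(samples) / n_paths
var = sum((x - mean) ** 2 for x in samples) / (n_paths - 1)
compensated_mean = mean - a * t        # sample mean of M_t = X_t - a t

print(mean, var, compensated_mean)
```

This is a statistical illustration only: it checks one marginal distribution, not the full martingale property.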

We can, for example, model a stock price process with jumps using the sum of a Brownian 
motion and a Poisson process, a so-called jump diffusion. 

Let's see what happens if we try to interpret the more interesting integral

\[ I = \int_0^T W_t\,dW_t \]

as a Riemann-Stieltjes integral. If the standard rules of calculus apply, we would get

\[ I = \tfrac{1}{2} W_T^2 \]

Taking expectations, we'd therefore get

\[ \mathbb{E}[I] = \tfrac{1}{2} T \]

because $\mathrm{Var}(W_T) = T$. That's what we'd get if the usual rules of calculus apply.^2

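Taking a Riemann-Stieltjes approach, let $P = \{0 = t_0 < t_1 < \dots < t_n = T\}$ be a partition of $[0,T]$. We now make two choices for the $t_k^* \in [t_{k-1}, t_k]$. First, let $t_k^* = t_{k-1}$ be the leftmost point of the interval $[t_{k-1}, t_k]$. We then approximate the stochastic integral by

\[ I \approx \sum_k W(t_{k-1}, \omega)\,\Delta_k W(\omega) \]

where $\Delta_k W(\omega) = W(t_k, \omega) - W(t_{k-1}, \omega)$. Taking expectations, we obtain

\[ \mathbb{E}[I] \approx \sum_k \mathbb{E}[W_{t_{k-1}} \Delta_k W] = \sum_k \mathbb{E}[W_{t_{k-1}}] \cdot \mathbb{E}[\Delta_k W] = 0 \]

because, by definition of Brownian motion, $W_{t_{k-1}}$ is independent of $\Delta_k W$.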
^2 But they don't...






Now we make a second choice of $t_k^*$. We choose it to be the rightmost point, i.e. $t_k^* = t_k$. Note that

\[ W(t_k, \omega)\,\Delta_k W(\omega) = (\Delta_k W(\omega))^2 + W(t_{k-1}, \omega)\,\Delta_k W(\omega) \]

If we approximate the stochastic integral by

\[ I \approx \sum_k W(t_k, \omega)\,\Delta_k W(\omega) \]

and take expectations, we obtain

\[ \mathbb{E}[I] \approx \sum_k \mathbb{E}[(\Delta_k W)^2] = \sum_k (t_k - t_{k-1}) = T \]

because $\Delta_k W$ is $N(0, t_k - t_{k-1})$-distributed (by definition of Brownian motion).

It immediately follows that $W_t$ is (a.s.) not Riemann-Stieltjes integrable with respect to itself, for if it were, the value of $I$ would be independent of our choice of the family $t_k^*$. This kills our naive attempt to interpret stochastic integrals pathwise as Riemann-Stieltjes integrals.
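The dependence on the choice of $t_k^*$ is easy to observe by simulation. The sketch below (a minimal illustration; the uniform partition, sample sizes, seed and the name `riemann_sums` are assumptions) computes left-, mid- and right-point sums for $\int_0^T W\,dW$ and averages them over many paths. It also uses two exact pathwise identities: the right sum minus the left sum equals $\sum_k (\Delta_k W)^2$, and the left sum equals $\frac12(W_T^2 - \sum_k (\Delta_k W)^2)$.

```python
import math
import random

def riemann_sums(n, T, rng):
    """Left, midpoint and right sums for int_0^T W dW on a uniform n-cell partition.

    Also returns the sum of squared increments and W_T.
    """
    dt = T / (2 * n)                   # grid twice as fine, so midpoints are available
    w = [0.0]
    for _ in range(2 * n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    left = mid = right = qv = 0.0
    for k in range(n):
        a, m, b = w[2 * k], w[2 * k + 1], w[2 * k + 2]
        dW = b - a
        left += a * dW
        mid += m * dW
        right += b * dW
        qv += dW * dW
    return left, mid, right, qv, w[-1]

rng = random.Random(2)                 # fixed seed, arbitrary
T, n, n_paths = 1.0, 50, 4000
results = [riemann_sums(n, T, rng) for _ in range(n_paths)]

mean_left = sum(r[0] for r in results) / n_paths
mean_mid = sum(r[1] for r in results) / n_paths
mean_right = sum(r[2] for r in results) / n_paths
print(mean_left, mean_mid, mean_right)   # roughly 0, T/2 and T
```

The three averages differ by amounts that do not shrink as the partition is refined, which is exactly the failure of Riemann-Stieltjes integrability described above.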

Exercise 3.4.2 Show that if we choose $t_k^*$ to be the midpoint of $[t_{k-1}, t_k]$, then we'd get $\mathbb{E}[I] \approx \frac{1}{2}T$, which is what ordinary calculus predicts.

□ 

The next exercise proves an important result:

Exercise 3.4.3 Show that if $M_t$ is a (square-integrable) $\mathcal{F}_t$-martingale, then for $s \le t$

\[ \mathbb{E}[(M_t - M_s)^2 | \mathcal{F}_s] = \mathbb{E}[M_t^2 - M_s^2 | \mathcal{F}_s] \]

This equation is referred to as the orthogonality of martingale increments because it depends on the fact that $\mathbb{E}[M_s(M_t - M_s) | \mathcal{F}_s] = 0$. This means that $M_s$ and $M_t - M_s$ are orthogonal in the Hilbert space $\mathcal{L}^2$.

□ 

The following result drives the final nail into the coffin:

Proposition 3.4.4 If $X_t$ is a continuous (a.s.) martingale locally of bounded variation on $[0,T]$, then $X_t$ is constant a.s. on $[0,T]$.

Proof: Without loss of generality, we may assume that $X_0 = 0$, and that the underlying filtration is complete. (We can always consider the martingale $X_t - X_0$ if needs be, and augment the filtration without changing the martingale property of $X$.) Let $V_X(t, \omega)$ be the variation of $X(\omega)$ on $[0,t]$. Given an arbitrary $K > 0$, define a stopping time $\tau$ by

\[ \tau = \inf\{t \ge 0 : V_X(t) \ge K\} \]

We now show that $Y_t = X_t^\tau = 0$ a.s. for all $t$. It suffices to show that $\mathbb{E}[Y_t^2] = 0$ for all $t$ (for if $\{Y_t \ne 0\}$ has positive measure, then $\mathbb{E}[Y_t^2] > 0$).






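Note that $Y_t$ is a continuous martingale, and that $V_Y(t) \le K$ for all $t$. Let $P = \{0 = t_0 < t_1 < \dots < t_n = t\}$ be a partition of $[0,t]$, and use the orthogonality of martingale increments to obtain

\[ \mathbb{E}[Y_t^2] = \mathbb{E}\Big[\sum_{k=1}^n (Y_{t_k}^2 - Y_{t_{k-1}}^2)\Big] = \mathbb{E}\Big[\sum_{k=1}^n (Y_{t_k} - Y_{t_{k-1}})^2\Big] \]

\[ \le \mathbb{E}\big[V_Y(t) \max_k |Y_{t_k} - Y_{t_{k-1}}|\big] \le K\,\mathbb{E}\big[\max_k |Y_{t_k} - Y_{t_{k-1}}|\big] \]

using the fact that $\sum_i a_i^2 \le (\sum_i |a_i|) \cdot \max_j |a_j|$. Thus

\[ \mathbb{E}[Y_t^2] \le K\,\mathbb{E}\big[\max_k |Y_{t_k} - Y_{t_{k-1}}|\big] \]

As $\|P\| \to 0$, $\max_k |Y_{t_k} - Y_{t_{k-1}}| \to 0$ a.s., by continuity of $Y$. By the Dominated Convergence Theorem, because $\max_k |Y_{t_k} - Y_{t_{k-1}}| \le 2K$, we see that $\mathbb{E}[\max_k |Y_{t_k} - Y_{t_{k-1}}|] \to 0$ as $\|P\| \to 0$, so that $\mathbb{E}[Y_t^2] = 0$, as required.

We have now shown that $\mathbb{P}(Y_t = 0) = 1$ for all $t \ge 0$. It follows that

\[ \mathbb{P}(Y_q = 0 \text{ for all } q \in \mathbb{Q}^+) = 1 \]

Since $Y_t$ is continuous, it now follows that

\[ \mathbb{P}(Y_t = 0 \text{ for all } t \ge 0) = 1 \]

Finally, since $K$ was arbitrary, and $\tau \ge T$ as soon as $K$ exceeds $V_X(T)$ (which is a.s. finite), it follows that $X_t = 0$ a.s. on $[0,T]$.

⊣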

Since standard Brownian motion is a continuous martingale that is not constant, it cannot be of bounded variation. In general, therefore, the Ito integral $\int_0^T H_t\,dW_t$ cannot be interpreted as a Riemann-Stieltjes integral (unless $H_t$ is itself of bounded variation, in which case we can use integration by parts).

By the way, this could also have been deduced from the fact, stated in the previous section (but not proved), that a function which is locally of bounded variation is necessarily differentiable almost everywhere, whereas Brownian paths are a.s. nowhere differentiable.
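The contrast between first variation and quadratic variation can be seen on a single simulated path. This is only a sketch: a Gaussian random walk on $2^{14}$ grid points stands in for Brownian motion on $[0,1]$, and the seed and the helper `variation_sums` are illustrative assumptions.

```python
import math
import random

rng = random.Random(3)                  # fixed seed, arbitrary
T, n_fine = 1.0, 2 ** 14
dt = T / n_fine
w = [0.0]
for _ in range(n_fine):
    w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))

def variation_sums(path, stride):
    """Return (sum |dW|, sum dW^2) along every `stride`-th point of the path."""
    pts = path[::stride]
    incs = [b - a for a, b in zip(pts, pts[1:])]
    return sum(abs(d) for d in incs), sum(d * d for d in incs)

# Key: number of cells in the partition; value: (first variation, quadratic variation)
sums = {n_fine // stride: variation_sums(w, stride) for stride in (256, 16, 1)}
for cells, (v1, v2) in sorted(sums.items()):
    print(f"cells={cells:6d}  sum|dW|={v1:8.3f}  sum dW^2={v2:6.3f}")
```

As the partition is refined, the sums of $|\Delta W|$ keep growing (like the square root of the number of cells), while the sums of $(\Delta W)^2$ settle near $T$.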

3.5 Quadratic Variation and Covariation 

The fact that the only continuous martingales of bounded variation are constant means that we will not be able to define stochastic integrals with respect to continuous martingales in the Riemann-Stieltjes or Lebesgue-Stieltjes way. We need to replace the variation by something that is finite, and that is quadratic variation. Keep in mind that the series $\sum_n \frac{1}{n}$ diverges, but $\sum_n \frac{1}{n^2}$ converges. Essentially, we define the quadratic variation $\langle X \rangle_t$ of a process $X_t$ to be

\[ \lim \sum_{k=1}^{n(P)} (X_{t_k} - X_{t_{k-1}})^2 \]

where the limit is over successively finer partitions $P$ of $[0,t]$. That's the intuitive idea, but we will need to be a bit more precise. The main aim of this section is to prove the following important result:






Theorem 3.5.1 Let $M_t$ be a continuous local martingale. Then there is a unique continuous increasing process $Q_t$ with $Q_0 = 0$ such that

\[ M_t^2 - Q_t \]

is a continuous local martingale. Moreover, if $M \in c\mathcal{M}_0^2$, then $M_t^2 - Q_t$ is a UI martingale. The process $Q_t$ is called the quadratic variation process or variance process of $M_t$ and is denoted by $\langle M \rangle_t$.

⊣

Please note the following easily verified identity (the "Summation By Parts" formula):

\[ x_n y_n - x_0 y_0 = \sum_{k=1}^n x_{k-1}(y_k - y_{k-1}) + \sum_{k=1}^n y_{k-1}(x_k - x_{k-1}) + \sum_{k=1}^n (x_k - x_{k-1})(y_k - y_{k-1}) \]

This formula is often applied with $y_k = x_k$:

\[ x_n^2 - x_0^2 = 2\sum_{k=1}^n x_{k-1}(x_k - x_{k-1}) + \sum_{k=1}^n (x_k - x_{k-1})^2 \]

Consider now a partition $\pi = \{0 = t_0 < t_1 < t_2 < \dots\}$ of the non-negative reals, with $t_n \uparrow \infty$. Define a function $k : \mathbb{R}^+ \to \mathbb{N}$ by $k(t) = \sup\{k : t_k \le t\}$. Further define a process $Q^\pi(M)$ by

\[ Q_t^\pi(M) = \sum_{k=1}^{k(t)} (M_{t_k} - M_{t_{k-1}})^2 + (M_t - M_{t_{k(t)}})^2 \]

As the partition $\pi$ gets finer and finer, $Q_t^\pi(M)(\omega)$ converges to the quadratic variation of the sample path $t \mapsto M_t(\omega)$ (provided it exists). Moreover, we have the following very nice property:

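Proposition 3.5.2 If $M_t$ is a bounded continuous martingale, then so is $M_t^2 - Q_t^\pi(M)$.

Proof: Let $Q_t = Q_t^\pi(M)$. It is clear that $Q_t$ is continuous if $M_t$ is, and thus $M_t^2 - Q_t$ is continuous. Note that if $r \le s \le t$, then

\[ \mathbb{E}[(M_t - M_r)^2 | \mathcal{F}_s] = \mathbb{E}[((M_t - M_s) + (M_s - M_r))^2 | \mathcal{F}_s] \]

\[ = \mathbb{E}[(M_t - M_s)^2 | \mathcal{F}_s] + (M_s - M_r)^2 \]

\[ = \mathbb{E}[M_t^2 - M_s^2 | \mathcal{F}_s] + (M_s - M_r)^2 \]

Now clearly

\[ Q_t - Q_s = \sum_{k=k(s)+1}^{k(t)} (M_{t_k} - M_{t_{k-1}})^2 + (M_t - M_{t_{k(t)}})^2 - (M_s - M_{t_{k(s)}})^2 \]

\[ = (M_{t_{k(s)+1}} - M_{t_{k(s)}})^2 - (M_s - M_{t_{k(s)}})^2 + \sum_{k=k(s)+2}^{k(t)} (M_{t_k} - M_{t_{k-1}})^2 + (M_t - M_{t_{k(t)}})^2 \]

Now take conditional expectations, using the formula at the beginning of the proof, with $r \le s \le t$ there given by $t_{k(s)} \le s < t_{k(s)+1}$:

\[ \mathbb{E}[Q_t - Q_s | \mathcal{F}_s] = \mathbb{E}[M_{t_{k(s)+1}}^2 - M_s^2 | \mathcal{F}_s] + (M_s - M_{t_{k(s)}})^2 - (M_s - M_{t_{k(s)}})^2 \]

\[ \qquad + \sum_{k=k(s)+2}^{k(t)} \mathbb{E}[M_{t_k}^2 - M_{t_{k-1}}^2 | \mathcal{F}_s] + \mathbb{E}[M_t^2 - M_{t_{k(t)}}^2 | \mathcal{F}_s] \]

\[ = \mathbb{E}[M_t^2 - M_s^2 | \mathcal{F}_s] \]

We now see that

\[ \mathbb{E}[M_t^2 - Q_t | \mathcal{F}_s] = M_s^2 - Q_s \]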



Why does this not prove Theorem 3.5.1? The problem is that $Q_t^\pi(M)$ may not be an increasing function (in $t$). However, it is clear that if the partition $\pi$ is finer than the partition $\lambda$, then $Q^\pi(M) \ge Q^\lambda(M)$. In some texts, it is now shown that the $Q^\pi(M)$ converge in probability to some increasing continuous process (as the $\pi$'s get finer), and this process is then called the quadratic variation. We shall take a different tack.

Given that $M \in c\mathcal{M}_0^2$, define for each $n \in \mathbb{N}$ a sequence of stopping times by

\[ T_0^n = 0, \qquad T_{k+1}^n = \inf\{t \ge T_k^n : |M_t - M_{T_k^n}| \ge 2^{-n}\} \]

Further define $t_k^n = t \wedge T_k^n$. Put

\[ Q_t^n = \sum_{k \ge 1} (M_{t_k^n} - M_{t_{k-1}^n})^2 \]

We shall show that the processes $Q^n$ converge uniformly a.s. to an increasing continuous process $Q_t$ and that $M_t^2 - Q_t$ is a UI martingale. For this, we require the following two propositions:

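Proposition 3.5.3 Let $M_t$ be a UI martingale, and let $\sigma \le \tau$ be stopping times. Suppose that $Z$ is a bounded $\mathcal{F}_\sigma$-measurable random variable. Define $N_t = Z(M_{\tau \wedge t} - M_{\sigma \wedge t})$. Then $N_t$ is a UI martingale. Moreover, if $M_t \in \mathcal{M}_0^2$, then $N_t \in \mathcal{M}_0^2$ as well.

Proof: We first deal with the UI case: By Theorem 2.2.14, it suffices to show that for every stopping time $\rho \le \infty$ we have

\[ \mathbb{E}|N_\rho| < \infty \qquad \mathbb{E}N_\rho = 0 \]

If $c = \sup_\omega |Z(\omega)| < \infty$, then

\[ \mathbb{E}|N_\rho| \le c\,\big(\mathbb{E}[|M_{\tau \wedge \rho}| + |M_{\sigma \wedge \rho}|]\big) \]

But $\mathbb{E}[|M_\infty| \,|\, \mathcal{F}_{\tau \wedge \rho}] \ge |\mathbb{E}[M_\infty | \mathcal{F}_{\tau \wedge \rho}]| = |M_{\tau \wedge \rho}|$ by the Optional Sampling Theorem, so that $\mathbb{E}|M_{\tau \wedge \rho}| \le \mathbb{E}|M_\infty| < \infty$. A similar argument shows that $\mathbb{E}|M_{\sigma \wedge \rho}| < \infty$ as well, and we conclude that $\mathbb{E}|N_\rho| < \infty$ for all stopping times $\rho$.

Next we must show that $\mathbb{E}N_\rho = 0$. For this, we apply the Monotone Class Theorem (Appendix A.2). Let $A \in \mathcal{F}_\sigma$, and first consider the case $Z = I_A$. Define random times $\tau_A, \sigma_A$ by

\[ \tau_A(\omega) = \begin{cases} \tau(\omega) & \text{if } \omega \in A \\ +\infty & \text{else} \end{cases} \]

and similarly for $\sigma_A$. Then $\tau_A$ is a stopping time: Since $A \in \mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$, we must have $A \cap \{\tau \le t\} = \{\tau_A \le t\} \in \mathcal{F}_t$ for all $t$. In the same way, it can be shown that $\sigma_A$ is a stopping time. By the Optional Sampling Theorem, $\mathbb{E}[M_{\tau_A \wedge \rho}] = \mathbb{E}[M_{\sigma_A \wedge \rho}]$. Since $N_\rho = I_A(M_{\tau \wedge \rho} - M_{\sigma \wedge \rho}) = M_{\tau_A \wedge \rho} - M_{\sigma_A \wedge \rho}$, the result follows for indicator $Z$.

Now let $\mathcal{H}_\rho = \{Z : Z \text{ is bounded, } \mathcal{F}_\sigma\text{-measurable, } \mathbb{E}Z(M_{\tau \wedge \rho} - M_{\sigma \wedge \rho}) = 0\}$. Then, invoking the Bounded Convergence Theorem, $\mathcal{H}_\rho$ satisfies the hypotheses of the Monotone Class Theorem, and $I_A \in \mathcal{H}_\rho$ for each $A \in \mathcal{F}_\sigma$. Hence every bounded $\mathcal{F}_\sigma$-measurable random variable belongs to $\mathcal{H}_\rho$.

Next suppose that $M_t \in \mathcal{M}_0^2$. From the aforegoing, it is obvious that $N_t$ is a UI martingale, null at zero. Moreover,

\[ \mathbb{E}[N_t^2] \le c^2\,\mathbb{E}[(M_{\tau \wedge t} - M_{\sigma \wedge t})^2] \]

\[ = c^2\,\mathbb{E}\big[\mathbb{E}[M_{\tau \wedge t}^2 - 2M_{\tau \wedge t}M_{\sigma \wedge t} + M_{\sigma \wedge t}^2 \,|\, \mathcal{F}_{\sigma \wedge t}]\big] \]

\[ = c^2\,\mathbb{E}[M_{\tau \wedge t}^2 - M_{\sigma \wedge t}^2] \]

by the Optional Sampling Theorem applied to the UI martingale $M_t$. Hence $\mathbb{E}[N_t^2] \le c^2\,\mathbb{E}[M_\infty^2] < \infty$, i.e. $N_t \in \mathcal{M}_0^2$.

⊣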

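Proposition 3.5.4 Let $M \in \mathcal{M}_0^2$, let $0 \le \tau_0 \le \tau_1 \le \dots$ be stopping times, and let $Z_k$ be $\mathcal{F}_{\tau_k}$-measurable random variables with $|Z_k| \le c$ for each $k$. Let $N$ be a bounded process of the form

\[ N_t = \sum_{k \ge 1} Z_{k-1}(M_{\tau_k \wedge t} - M_{\tau_{k-1} \wedge t}) \]

Then $N \in \mathcal{M}_0^2$, and $\|N\|_2 \le c\|M\|_2$.

Proof: Consider first the following processes, with finitely many terms each:

\[ N_t^n = \sum_{k=1}^n Z_{k-1}(M_{\tau_k \wedge t} - M_{\tau_{k-1} \wedge t}) \]

By the previous proposition, each term $Z_{k-1}(M_{\tau_k \wedge t} - M_{\tau_{k-1} \wedge t})$ belongs to $\mathcal{M}_0^2$, and hence so does each $N^n$ (because $\mathcal{M}_0^2$ is a vector space). Indeed

\[ \mathbb{E}[(N_t^n)^2] = \sum_{k=1}^n \mathbb{E}[Z_{k-1}^2 (M_{\tau_k \wedge t} - M_{\tau_{k-1} \wedge t})^2] \]

\[ \le c^2 \sum_{k=1}^n \mathbb{E}[(M_{\tau_k \wedge t} - M_{\tau_{k-1} \wedge t})^2] \]

\[ = c^2 \sum_{k=1}^n \mathbb{E}[M_{\tau_k \wedge t}^2 - M_{\tau_{k-1} \wedge t}^2] \]

\[ \le c^2\,\mathbb{E}[M_\infty^2] \]

so that $\|N^n\|_2 \le c\|M\|_2$.

Now fix $t$. Clearly $N_t^n \to N_t$ a.s. as $n \to \infty$. By the aforegoing, $\{N_t^n : n \in \mathbb{N}\}$ is bounded in $\mathcal{L}^2$, and thus UI. We thus have $N_t^n \to N_t$ in $\mathcal{L}^1$ as well (by Theorem 1.6.8). Thus $\mathbb{E}[N_t^n | \mathcal{F}_s] \to \mathbb{E}[N_t | \mathcal{F}_s]$ in $\mathcal{L}^1$ for $s \le t$. However, also $\mathbb{E}[N_t^n | \mathcal{F}_s] = N_s^n \to N_s$ in $\mathcal{L}^1$. It follows that $\mathbb{E}[N_t | \mathcal{F}_s] = N_s$ a.s., i.e. that $N_t$ is a martingale. By Fatou's Lemma,

\[ \mathbb{E}[N_t^2] \le \liminf_n \mathbb{E}[(N_t^n)^2] \le c^2\,\mathbb{E}[M_\infty^2] \]

which shows that $N \in \mathcal{M}_0^2$ with $\|N\|_2 \le c\|M\|_2$.

⊣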

Proof of Theorem 3.5.1: First assume that $M$ is a bounded martingale null at zero. Proving uniqueness is easy: If $M_t^2 - Q_t$ and $M_t^2 - Q'_t$ are both continuous martingales, where $Q_t, Q'_t$ are increasing continuous processes null at zero, then the difference

\[ (M_t^2 - Q'_t) - (M_t^2 - Q_t) = Q_t - Q'_t \]

is a martingale, and, moreover, locally of bounded variation (being a difference of two increasing processes). Thus $Q_t - Q'_t$ is constant (by Proposition 3.4.4), and since $Q_0 = Q'_0 = 0$, we have $Q_t = Q'_t$.

Next we worry about existence: Define for each $n \in \mathbb{N}$ a sequence of stopping times by

\[ T_0^n = 0, \qquad T_{k+1}^n = \inf\{t \ge T_k^n : |M_t - M_{T_k^n}| \ge 2^{-n}\} \]

Further define $t_k^n = t \wedge T_k^n$. Put

\[ Q_t^n = \sum_{k \ge 1} (M_{t_k^n} - M_{t_{k-1}^n})^2 \]

By the summation by parts formula

\[ M_t^2 = 2\sum_{k \ge 1} M_{t_{k-1}^n}(M_{t_k^n} - M_{t_{k-1}^n}) + \sum_{k \ge 1} (M_{t_k^n} - M_{t_{k-1}^n})^2 = 2N_t^n + Q_t^n \]

where the sum in the first term defines $N_t^n$. Note that by the previous proposition, $N^n \in \mathcal{M}_0^2$ for each $n$, and thus $M_t^2 - Q_t^n$ is a martingale, for each $n$.

We shall now show that the processes $N^n$ converge uniformly a.s. Note that the $t_k^n$ form successively finer partitions of $(0,t]$ as $n$ increases, and thus that each $t_j^n$ is equal to some $t_k^{n+1}$. Define

\[ s_k^n = \max\{t_j^n : t_j^n \le t_k^{n+1}\} \]

so that $s_k^n$ is the biggest $t_j^n$ which lies below $t_k^{n+1}$. Note that the $s_k^n$ are not necessarily distinct. Now fix $j$ and consider the term $M_{t_{j-1}^n}(M_{t_j^n} - M_{t_{j-1}^n})$. Also consider the set

\[ \{t_k^{n+1} : s_k^n = t_{j-1}^n\} = \{t_m^{n+1}, t_{m+1}^{n+1}, \dots, t_{m+i-1}^{n+1}\} \]

i.e. the set of all $t_k^{n+1}$ such that $t_{j-1}^n \le t_k^{n+1} < t_j^n$. Note that $t_m^{n+1} = t_{j-1}^n$ and $t_{m+i}^{n+1} = t_j^n$. Then $M_{t_{j-1}^n} = M_{s_m^n} = M_{s_{m+1}^n} = \dots = M_{s_{m+i-1}^n}$, and so

\[ M_{t_{j-1}^n}(M_{t_j^n} - M_{t_{j-1}^n}) = \sum_{k=m+1}^{m+i} M_{s_{k-1}^n}(M_{t_k^{n+1}} - M_{t_{k-1}^{n+1}}) \]

It follows that

\[ N_t^n = \sum_{j \ge 1} M_{t_{j-1}^n}(M_{t_j^n} - M_{t_{j-1}^n}) = \sum_{k \ge 1} M_{s_{k-1}^n}(M_{t_k^{n+1}} - M_{t_{k-1}^{n+1}}) \]

and thus

\[ N_t^{n+1} - N_t^n = \sum_{k \ge 1} (M_{t_{k-1}^{n+1}} - M_{s_{k-1}^n})(M_{t_k^{n+1}} - M_{t_{k-1}^{n+1}}) \]

which is of the form

\[ \sum_{k \ge 1} Z_{k-1}(M_{t_k^{n+1}} - M_{t_{k-1}^{n+1}}) \]

where each $Z_{k-1} = (M_{t_{k-1}^{n+1}} - M_{s_{k-1}^n})$ is $\mathcal{F}_{t_{k-1}^{n+1}}$-measurable. Now clearly each $|Z_{k-1}| \le 2 \cdot 2^{-(n+1)} = 2^{-n}$, and thus by Proposition 3.5.4, each $N^{n+1} - N^n$ is a martingale in $c\mathcal{M}_0^2$ with

\[ \|N^{n+1} - N^n\|_2 \le 2^{-n}\|M\|_2 \]

It follows that the sequence $N^n$ is a Cauchy sequence in $c\mathcal{M}_0^2$, and thus it converges a.s. uniformly to some martingale $N \in c\mathcal{M}_0^2$. It follows that the processes $Q^n$ converge a.s. uniformly to some process $Q_t$. Then $M_t^2 = 2N_t + Q_t$. Also, $Q_t$ is continuous, being the uniform limit of the continuous $Q^n$.

Now it is clear that even though the process $Q^n$ may not be increasing, we certainly have $Q^n_{T_k^n} \le Q^n_{T_{k+1}^n}$ for each $k$. By definition of $Q$ we thus see that $Q_{T_k^n} \le Q_{T_{k+1}^n}$ (for each $n$ and $k$). For each $\omega \in \Omega$ define a countable subset $J(\omega)$ of $\mathbb{R}^+$ by

\[ J_n(\omega) = \{T_k^n(\omega) : k \in \mathbb{N}\} \qquad J(\omega) = \bigcup_n J_n(\omega) \]

Then each $Q^n(\omega)$ is increasing on $J_n(\omega)$, and thus $Q$ is increasing on $J(\omega)$ (and thus on the closure of $J(\omega)$). Now suppose that the interval $I$ is disjoint from $J(\omega)$. Then no $T_k^n(\omega)$ belongs to $I$. Since $M(\omega)$ is continuous (a.s.) it follows that $M(\omega)$ is constant on $I$. But then each $Q^n(\omega)$ is constant on $I$ as well, and hence $Q(\omega)$ is constant on $I$. Since "constant" is a special case of "increasing", we see that $Q$ is (a.s.) increasing.

We have now proved that quadratic variation exists for bounded continuous martingales. To deal with the general case, let $M$ be an arbitrary continuous local martingale (null at zero), and choose a localizing sequence of stopping times $\tau_n$ such that each stopped process $M^{\tau_n}$ is a bounded martingale. By what we have just proved, there exists a unique increasing continuous process $Q^n$ such that $(M_t^{\tau_n})^2 - Q_t^n$ is a martingale for each $n$. However, uniqueness of $Q^n$ ensures that $Q_t^{n+1}(\omega) = Q_t^n(\omega)$ whenever $t \le \tau_n(\omega)$. Thus we can define $Q_t(\omega)$ by

\[ Q_t(\omega) = Q_t^n(\omega) \quad \text{where } n \text{ is such that } t \le \tau_n(\omega) \]

(Such $n$ exists because $\tau_n \uparrow \infty$.) Then $Q$ is clearly continuous and increasing. Moreover, the stopped process

\[ (M_t^2 - Q_t)^{\tau_n} = M_{\tau_n \wedge t}^2 - Q_{\tau_n \wedge t} \]

is a martingale by construction, and thus $M_t^2 - Q_t$ is a local martingale.
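The stopping-time construction in the proof can be imitated on a discretised path. The sketch below is an illustration under stated assumptions: a Gaussian random walk with 50000 steps stands in for a continuous martingale on $[0,1]$, the thresholds $2^{-n}$ are only crossed approximately on a grid, and the function name `stopping_partition_sums` is hypothetical. It computes $N^n$ and $Q^n$ at the final time and confirms the summation by parts identity $M_t^2 = 2N_t^n + Q_t^n$.

```python
import math
import random

def stopping_partition_sums(path, n):
    """Return (N, Q) at the final time for the partition by the times T^n_k."""
    eps = 2.0 ** (-n)
    anchor = path[0]          # value of M at the most recent stopping time T^n_k
    N = Q = 0.0
    for x in path[1:]:
        if abs(x - anchor) >= eps:       # the next stopping time has been reached
            N += anchor * (x - anchor)
            Q += (x - anchor) ** 2
            anchor = x
    last = path[-1]                      # incomplete final cell (the t ^ T^n_k terms)
    N += anchor * (last - anchor)
    Q += (last - anchor) ** 2
    return N, Q

rng = random.Random(7)                   # fixed seed, arbitrary
steps = 50000
path = [0.0]
for _ in range(steps):
    path.append(path[-1] + rng.gauss(0.0, math.sqrt(1.0 / steps)))

final = path[-1]
sums = {n: stopping_partition_sums(path, n) for n in (2, 3, 4, 5)}
for n, (N, Q) in sums.items():
    print(n, round(Q, 4), round(final ** 2 - 2 * N, 4))   # the two columns coincide
```

For this Brownian-like path the values $Q^n$ should approach the elapsed time (here 1) as $n$ grows, in line with the quadratic variation of Brownian motion being $t$.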






Different texts will give different definitions of quadratic variation. The advantage of the 
above construction is that it is defined pathwise, and thus we will be able to prove several 
a.s. results, whereas many of the other definitions will only yield results that are true in 
probability. One immediate consequence of the pathwise definition is that the quadratic 
variation does not depend on the probability measure. Implicitly, most of our definitions are 
a.s., so if P, Q are equivalent probability measures, then the quadratic variation of a stochastic 
process X under P is the same as its quadratic variation under Q. 



Chapter 4 

Outline of Stochastic Integration 



4.1 $L^2$-Theory of the Stochastic Integral

Throughout, let $M, N \in c\mathcal{M}_0^2$ be continuous square-integrable martingales starting at 0.



4.1.1 Basic Integrands 

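A basic predictable process is analogous to a buy-and-hold strategy $H := C\,I_{(t_1, t_2]}$: Buy $C$-many shares $M$ at time $t_1$, and sell them at $t_2$. The amount bought at time $t_1$ must be known at time $t_1$, i.e. $C$ is a bounded $\mathcal{F}_{t_1}$-measurable RV. The gain $G_t$ at time $t$ is then

\[ G_t := \begin{cases} 0 & \text{if } t \le t_1 \\ C(M_t - M_{t_1}) & \text{if } t_1 < t \le t_2 \\ C(M_{t_2} - M_{t_1}) & \text{if } t_2 < t \end{cases} \]

i.e. $G_t := C(M_{t_2 \wedge t} - M_{t_1 \wedge t})$.

We define

\[ (H \bullet M)_t := C(M_{t_2 \wedge t} - M_{t_1 \wedge t}) \]

Proposition 4.1.1 $H \bullet M \in c\mathcal{M}_0^2$.

Proof: The gist: Let $\tau$ be a stopping time. Then $\mathbb{E}[(H \bullet M)_\tau] = \mathbb{E}[C(M_{t_2 \wedge \tau} - M_{t_1 \wedge \tau})] = \mathbb{E}[C\,\mathbb{E}[M_{t_2}^\tau - M_{t_1}^\tau | \mathcal{F}_{t_1}]] = 0$, as $M^\tau$ is a martingale. It follows that $(H \bullet M)$ is a UI martingale.

Furthermore, by orthogonality of martingale increments, $\mathbb{E}[(H \bullet M)_\infty^2] = \mathbb{E}[C^2(M_{t_2} - M_{t_1})^2] = \mathbb{E}[C^2\,\mathbb{E}[(M_{t_2} - M_{t_1})^2 | \mathcal{F}_{t_1}]] = \mathbb{E}[C^2\,\mathbb{E}[M_{t_2}^2 - M_{t_1}^2 | \mathcal{F}_{t_1}]] = \mathbb{E}[C^2(M_{t_2}^2 - M_{t_1}^2)] < \infty$ as $C$ is bounded. Hence $H \bullet M$ is square-integrable.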



4.1.2 Simple Integrands 

Now consider a more dynamic trading strategy $H$. Trade at dates $0 \le t_0 < t_1 < \dots < t_n$ only, to get

\[ H := \sum_{k=1}^n C_{k-1}\,I_{(t_{k-1}, t_k]} \quad \text{where } C_{k-1} \in \mathcal{F}_{t_{k-1}} \text{ is bounded} \]

Such a linear combination of basic integrands is called a simple predictable process. Extend the integral from basic to simple integrands by demanding that linearity be preserved, i.e. define

\[ (H \bullet M)_t := \int_0^t H_s\,dM_s := \sum_{k=1}^n C_{k-1}(M_{t_k \wedge t} - M_{t_{k-1} \wedge t}) = \sum_{k=1}^{k(t)} C_{k-1}(M_{t_k} - M_{t_{k-1}}) + C_{k(t)}(M_t - M_{t_{k(t)}}) \]

where $k(t) := \max\{k : t_k < t\}$.

Proposition 4.1.2 $H \bullet M \in c\mathcal{M}_0^2$.

Proof: The gist: By Propn. 4.1.1 each $C_{k-1}(M_{t_k \wedge t} - M_{t_{k-1} \wedge t})$ is a martingale in $c\mathcal{M}_0^2$. Since $H \bullet M$ is a sum of these martingales, it too is $\in c\mathcal{M}_0^2$.
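The defining formula for the integral of a simple predictable process translates directly into code. The sketch below is illustrative only: the function name `simple_integral` is hypothetical, the holdings are taken to be constants (in the theory each $C_{k-1}$ may be a bounded random variable known at time $t_{k-1}$), and the "path" $M_s = s^2$ is a deterministic stand-in chosen so the arithmetic can be followed by hand.

```python
def simple_integral(times, holdings, M, t):
    """(H . M)_t for H = sum_k holdings[k-1] * 1_{(times[k-1], times[k]]}.

    M is a function s -> M_s giving the integrator along one path.
    """
    total = 0.0
    for k in range(1, len(times)):
        lo, hi = min(times[k - 1], t), min(times[k], t)   # t_{k-1} ^ t and t_k ^ t
        total += holdings[k - 1] * (M(hi) - M(lo))
    return total

M = lambda s: s * s                  # stand-in path, not a martingale
times = [0.0, 1.0, 2.0, 3.0]
holdings = [2.0, -1.0, 3.0]          # C_0, C_1, C_2

# 2*(1 - 0) - 1*(4 - 1) + 3*(6.25 - 4) = 5.75
print(simple_integral(times, holdings, M, 2.5))

# A basic integrand (buy-and-hold 5 shares on (1, 2]) is the special case n = 1:
print(simple_integral([1.0, 2.0], [5.0], M, 0.5))   # before t_1: gain 0
print(simple_integral([1.0, 2.0], [5.0], M, 3.0))   # after t_2: 5*(4 - 1) = 15
```

Taking the minimum with `t` implements the stopped increments $M_{t_k \wedge t} - M_{t_{k-1} \wedge t}$, so cells that have not yet started contribute nothing.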



Observe that the set $S$ of simple predictable processes is a vector space. Furthermore, the integral is a linear operator, i.e. if $H, K$ are simple predictable, and $\alpha, \beta \in \mathbb{R}$, then

\[ (\alpha H + \beta K) \bullet M = \alpha(H \bullet M) + \beta(K \bullet M) \]

i.e.

\[ \int_0^t (\alpha H_s + \beta K_s)\,dM_s = \alpha \int_0^t H_s\,dM_s + \beta \int_0^t K_s\,dM_s \]



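Proposition 4.1.3 If $H$ is a simple predictable process, then

\[ \|H \bullet M\|_{\mathcal{M}^2}^2 = \mathbb{E}\Big[\int_0^\infty H_s^2\,d[M]_s\Big] \]

Comment: The integral on the right is (for each outcome $\omega \in \Omega$) an ordinary Riemann-Stieltjes (or Lebesgue-Stieltjes) integral, as the process $[M]_t$ is increasing, and thus of locally finite variation.

Proof: $\|H \bullet M\|_{\mathcal{M}^2}^2 := \mathbb{E}[(H \bullet M)_\infty^2] = \mathbb{E}\big[\big(\sum_{k=1}^n C_{k-1}(M_{t_k} - M_{t_{k-1}})\big)^2\big]$. Now

\[ \Big(\sum_{k=1}^n C_{k-1}(M_{t_k} - M_{t_{k-1}})\Big)^2 = \sum_{k=1}^n C_{k-1}^2 (M_{t_k} - M_{t_{k-1}})^2 + 2\sum_{k=1}^n \sum_{j<k} C_{k-1}C_{j-1}(M_{t_k} - M_{t_{k-1}})(M_{t_j} - M_{t_{j-1}}) \]

By orthogonality of martingale increments and the tower property,

\[ \mathbb{E}[(H \bullet M)_\infty^2] = \mathbb{E}\Big[\sum_{k=1}^n C_{k-1}^2 (M_{t_k} - M_{t_{k-1}})^2\Big] + 2\,\mathbb{E}\Big[\sum_{k=1}^n \sum_{j<k} C_{k-1}C_{j-1}(M_{t_k} - M_{t_{k-1}})(M_{t_j} - M_{t_{j-1}})\Big] \]

\[ = \mathbb{E}\Big[\sum_{k=1}^n C_{k-1}^2\,\mathbb{E}[M_{t_k}^2 - M_{t_{k-1}}^2 | \mathcal{F}_{t_{k-1}}]\Big] + 0 \]

Now $M_t^2 - [M]_t$ is a UI martingale, and hence $\mathbb{E}[M_{t_k}^2 - M_{t_{k-1}}^2 | \mathcal{F}_{t_{k-1}}] = \mathbb{E}[[M]_{t_k} - [M]_{t_{k-1}} | \mathcal{F}_{t_{k-1}}]$. It follows that

\[ \mathbb{E}[(H \bullet M)_\infty^2] = \mathbb{E}\Big[\sum_{k=1}^n C_{k-1}^2\,\mathbb{E}[[M]_{t_k} - [M]_{t_{k-1}} | \mathcal{F}_{t_{k-1}}]\Big] \]

\[ = \mathbb{E}\Big[\sum_{k=1}^n C_{k-1}^2 ([M]_{t_k} - [M]_{t_{k-1}})\Big] \]

\[ = \mathbb{E}\Big[\int_0^\infty H_s^2\,d[M]_s\Big] \]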
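The identity of Proposition 4.1.3 can be tested by Monte Carlo in the Brownian case, where $d[B]_s = ds$. The sketch below is a hedged illustration: the integrator is Brownian motion on $[0,1]$ (treated as stopped at time 1), the holdings $C_{k-1}$ are $B_{t_{k-1}}$ clipped to $[-1,1]$ so that they are bounded and known at the left endpoint, and the name `one_path`, the partition and the sample size are all arbitrary choices.

```python
import math
import random

def one_path(n, T, rng):
    """Return ((H . B)_T, int_0^T H_s^2 ds) for H = sum_k C_{k-1} 1_{(t_{k-1}, t_k]}."""
    dt = T / n
    b = integral = energy = 0.0
    for _ in range(n):
        c = max(-1.0, min(1.0, b))          # C_{k-1}: bounded, F_{t_{k-1}}-measurable
        db = rng.gauss(0.0, math.sqrt(dt))
        integral += c * db                  # C_{k-1} (B_{t_k} - B_{t_{k-1}})
        energy += c * c * dt                # C_{k-1}^2 ([B]_{t_k} - [B]_{t_{k-1}})
        b += db
    return integral, energy

rng = random.Random(4)                      # fixed seed, arbitrary
n, T, n_paths = 20, 1.0, 5000
pairs = [one_path(n, T, rng) for _ in range(n_paths)]

lhs = sum(i * i for i, _ in pairs) / n_paths    # estimates E[(H . B)_T^2]
rhs = sum(e for _, e in pairs) / n_paths        # estimates E[int H^2 d[B]]
print(lhs, rhs)
```

The two estimates agree up to sampling error. Replacing `c` by something that peeks at `db` (a non-predictable holding) would break the agreement.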



4.1.3 The space $L^2(M)$

Given an $M \in c\mathcal{M}_0^2$, define a function $\|\cdot\|_{L^2(M)}$ on the set of predictable processes by

\[ \|H\|_{L^2(M)}^2 := \mathbb{E}\Big[\int_0^\infty H_s^2\,d[M]_s\Big] \]

This looks a bit like an $L^2$-norm.

Let

\[ L^2(M) := \{\text{set of all predictable processes } H \text{ with } \|H\|_{L^2(M)} < \infty\} \]

(where two processes are regarded as equal if they are indistinguishable).

Here are some facts:

• $\|\cdot\|_{L^2(M)}$ makes $L^2(M)$ into a normed vector space.

• Every simple predictable process belongs to $L^2(M)$.

• The set $S$ of simple predictable processes is dense in $L^2(M)$, i.e. if $H \in L^2(M)$ then there is a sequence $(H^{(n)})_n$ of simple predictable processes such that $H^{(n)} \to H$ in $L^2(M)$, i.e. $\|H^{(n)} - H\|_{L^2(M)} \to 0$ as $n \to \infty$.

Further observe that we have the following isometry:

Corollary 4.1.4 (Ito Isometry) If $H \in S$ is a simple predictable process, then

\[ \|H\|_{L^2(M)} = \|H \bullet M\|_{\mathcal{M}^2} \]

4.1.4 The Stochastic Integral on $L^2(M)$

We now use the Ito isometry and the fact that $S$ is dense in $L^2(M)$ to lift the stochastic integral from $S$ to all of $L^2(M)$. The basic idea is as follows: Let $H \in L^2(M)$. We want to define $H \bullet M$.

• Choose a sequence of simple predictable processes $H^{(n)} \in S$ so that $\|H^{(n)} - H\|_{L^2(M)} \to 0$.

• Then the sequence $H^{(n)}$ is a Cauchy sequence in $L^2(M)$ (since any convergent sequence is a Cauchy sequence), i.e. $\|H^{(n)} - H^{(m)}\|_{L^2(M)} \to 0$ as $n, m \to \infty$.

• Now the stochastic integral has already been defined for simple predictable processes, i.e. $H^{(n)} \bullet M$ has already been defined, for each $n$, and each $H^{(n)} \bullet M \in c\mathcal{M}_0^2$.

• Furthermore, $\|H^{(n)} \bullet M - H^{(m)} \bullet M\|_{\mathcal{M}^2} = \|(H^{(n)} - H^{(m)}) \bullet M\|_{\mathcal{M}^2} = \|H^{(n)} - H^{(m)}\|_{L^2(M)}$. Thus $(H^{(n)} \bullet M)_n$ is a Cauchy sequence in $c\mathcal{M}_0^2$.

• But $c\mathcal{M}_0^2$ is a Hilbert space, and hence complete. Thus the Cauchy sequence $(H^{(n)} \bullet M)_n$ converges to some martingale $N \in c\mathcal{M}_0^2$.

• Now we define $H \bullet M$ to be that limit $N$.

• Automatically we have $H \bullet M \in c\mathcal{M}_0^2$.

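Proposition 4.1.5 (Ito Isometry) If $H \in L^2(M)$, then $\|H\|_{L^2(M)} = \|H \bullet M\|_{\mathcal{M}^2}$.

Proof: Choose $H^{(n)} \in S$ so that $H^{(n)} \to H$ in $L^2(M)$. It follows that $\|H^{(n)}\|_{L^2(M)} \to \|H\|_{L^2(M)}$.^3 Similarly, since $H^{(n)} \bullet M \to H \bullet M$ in $c\mathcal{M}_0^2$, we have $\|H^{(n)} \bullet M\|_{\mathcal{M}^2} \to \|H \bullet M\|_{\mathcal{M}^2}$. The Ito isometry for simple predictable processes yields that $\|H^{(n)}\|_{L^2(M)} = \|H^{(n)} \bullet M\|_{\mathcal{M}^2}$ for each $n$. Taking limits, we see that

\[ \|H\|_{L^2(M)} = \lim_n \|H^{(n)}\|_{L^2(M)} = \lim_n \|H^{(n)} \bullet M\|_{\mathcal{M}^2} = \|H \bullet M\|_{\mathcal{M}^2} \]

⊣

^3 Observe that if $x_n \to x$ in a normed space, then $\|x_n - x\| \to 0$, and since $0 \le \big|\|x_n\| - \|x\|\big| \le \|x_n - x\|$, we have $\|x_n\| \to \|x\|$.

A similar argument shows that the stochastic integral is linear.

The following fact is sometimes useful, and easy to believe. It says that stopping the stochastic integral $(H \bullet M)$ at a stopping time $\tau$ is equivalent to either stopping the martingale $M$ at time $\tau$ (so that $dM_t^\tau = 0$ for $t > \tau$), or setting the integrand $H$ to zero after time $\tau$.

Proposition 4.1.6 If $\tau$ is a stopping time, then

\[ (H \bullet M)^\tau = H \bullet M^\tau = H I_{(0,\tau]} \bullet M \]

i.e.

\[ \int_0^{\tau \wedge t} H_s\,dM_s = \int_0^t H_s\,dM_s^\tau = \int_0^t H_s I_{(0,\tau]}(s)\,dM_s \]

Proof: We will prove that $(H \bullet M)^\tau = H \bullet M^\tau$: If $H$ is a simple predictable process, $H := \sum_k C_{k-1} I_{(t_{k-1}, t_k]}$, then

\[ (H \bullet M)_t^\tau = \Big(\sum_k C_{k-1}(M_{t_k \wedge t} - M_{t_{k-1} \wedge t})\Big)^\tau \]

\[ = \sum_k C_{k-1}(M_{t_k \wedge t \wedge \tau} - M_{t_{k-1} \wedge t \wedge \tau}) \]

\[ = \sum_k C_{k-1}(M_{t_k \wedge t}^\tau - M_{t_{k-1} \wedge t}^\tau) \]

\[ = (H \bullet M^\tau)_t \]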






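which proves the result if $H$ is simple.

Now let $H \in L^2(M)$. Choose a sequence $H^{(n)}$ of simple predictable processes such that $\|H^{(n)} - H\|_{L^2(M)} \to 0$. Then also $\|H^{(n)} \bullet M - H \bullet M\|_{\mathcal{M}^2} \to 0$. Now by the Optional Sampling Theorem and Jensen's Inequality,

\[ \|(H^{(n)} \bullet M)^\tau - (H \bullet M)^\tau\|_{\mathcal{M}^2}^2 = \mathbb{E}\big[\big((H^{(n)} \bullet M)_\tau - (H \bullet M)_\tau\big)^2\big] \]

\[ = \mathbb{E}\big[\big(\mathbb{E}[(H^{(n)} \bullet M)_\infty - (H \bullet M)_\infty | \mathcal{F}_\tau]\big)^2\big] \]

\[ \le \mathbb{E}\big[\big((H^{(n)} \bullet M)_\infty - (H \bullet M)_\infty\big)^2\big] \]

\[ = \|H^{(n)} \bullet M - H \bullet M\|_{\mathcal{M}^2}^2 \to 0 \]

Thus $(H^{(n)} \bullet M)^\tau \to (H \bullet M)^\tau$ in $c\mathcal{M}_0^2$. However, $(H^{(n)} \bullet M)^\tau = (H^{(n)} \bullet M^\tau)$, so also $(H^{(n)} \bullet M)^\tau \to (H \bullet M^\tau)$. Uniqueness of limits shows $(H \bullet M)^\tau = (H \bullet M^\tau)$, as required.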



4.1.5 An Example 



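This example contains some heuristic calculations, and is not watertight. Consider the stochastic integral $\int_0^t B_s\,dB_s$. Here the integrand is $H = B\,I_{(0,t]}$, and the integrator $M$ is $B$. Recall that $[B]_t = t$.

Note that Brownian motion is not a square-integrable martingale, since $\sup_{s \ge 0} \mathbb{E}[B_s^2] = \infty$. However, if $t > 0$ is fixed, then the stopped Brownian motion $B^t$ (defined by $B_s^t := B_{s \wedge t}$) is a square-integrable martingale, as $\sup_{s \ge 0} \mathbb{E}[(B_s^t)^2] = \mathbb{E}[B_t^2] = t < \infty$. Then $\int_0^t H_s\,dB_s = \int_0^\infty H_s\,dB_s^t$, as $dB_s^t = 0$ for $s > t$. Hence the theory developed so far applies.

Let $P = \{0 = t_0 < t_1 < t_2 < \dots < t_{n(P)} = t\}$ be a partition of $[0,t]$, and define simple predictable processes

\[ H^{(P)} := \sum_{k=1}^{n(P)} B_{t_{k-1}} I_{(t_{k-1}, t_k]} \]

Recall that the mesh $\sigma(P)$ of the partition $P$ is defined by $\sigma(P) := \max\{|t_k - t_{k-1}| : k = 1, \dots, n(P)\}$.

Now

\[ \|H^{(P)} - H\|_{L^2(M)}^2 = \mathbb{E}\Big[\sum_{k=1}^{n(P)} \int_{t_{k-1}}^{t_k} (B_{t_{k-1}} - B_s)^2\,ds\Big] = \sum_{k=1}^{n(P)} \int_{t_{k-1}}^{t_k} (s - t_{k-1})\,ds = \frac{1}{2}\sum_{k=1}^{n(P)} (t_k - t_{k-1})^2 \]

Now as $\sigma(P) \to 0$, we have $\sum_{k=1}^{n(P)} (t_k - t_{k-1})^2 \to [t]_t$, the quadratic variation of the function $f(t) = t$, which is zero. Hence $H^{(P)} \to H$ in $L^2(M)$ as $\sigma(P) \to 0$, and hence $H^{(P)} \bullet B \to H \bullet B$ in $c\mathcal{M}^2$. It follows that $\sum_{k=1}^{n(P)} B_{t_{k-1}}(B_{t_k} - B_{t_{k-1}}) \to \int_0^t B_s\,dB_s$ in $L^2$.

Now note that $b(a - b) = \frac{1}{2}(a + b)(a - b) - \frac{1}{2}(a - b)^2 = \frac{1}{2}(a^2 - b^2) - \frac{1}{2}(a - b)^2$. With $a := B_{t_k}$ and $b := B_{t_{k-1}}$, we get

\[ \sum_{k=1}^{n(P)} B_{t_{k-1}}(B_{t_k} - B_{t_{k-1}}) = \frac{1}{2}\underbrace{\sum_{k=1}^{n(P)} (B_{t_k}^2 - B_{t_{k-1}}^2)}_{(I)} - \frac{1}{2}\underbrace{\sum_{k=1}^{n(P)} (B_{t_k} - B_{t_{k-1}})^2}_{(II)} \]

The sum (I) is telescoping, and sums to $B_{t_{n(P)}}^2 - B_{t_0}^2 = B_t^2 - B_0^2 = B_t^2$.

The sum (II) converges (in probability) to the quadratic variation $[B]_t = t$ (as $\sigma(P) \to 0$).

Hence $\sum_{k=1}^{n(P)} B_{t_{k-1}}(B_{t_k} - B_{t_{k-1}})$ converges (in probability) to $\frac{1}{2}(B_t^2 - t)$. Since it also converges to $\int_0^t B_s\,dB_s$ (in $L^2$, and hence in probability), we see that

\[ \int_0^t B_s\,dB_s = \frac{1}{2}(B_t^2 - t) \]

We already knew that $B_t^2 - t$ is a martingale.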
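The rate of convergence in this example can be observed numerically. For a uniform partition with $n$ cells, the computation of $\|H^{(P)} - H\|^2_{L^2(M)}$ above gives $\frac12 \sum_k (t_k - t_{k-1})^2 = t^2/(2n)$, and by the Ito isometry this should also be the mean squared distance between the left-hand sum and $\frac12(B_t^2 - t)$. The sketch below estimates that mean squared error by Monte Carlo; the name `left_sum_error`, the sample size and the seed are illustrative assumptions.

```python
import math
import random

def left_sum_error(n, t, rng):
    """Left-hand sum for int_0^t B dB minus (B_t^2 - t)/2, uniform n-cell partition."""
    dt = t / n
    b = left = 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        left += b * db          # B_{t_{k-1}} (B_{t_k} - B_{t_{k-1}})
        b += db
    return left - 0.5 * (b * b - t)

rng = random.Random(5)          # fixed seed, arbitrary
t, n_paths = 1.0, 4000
mse = {}
for n in (10, 40, 160):
    mse[n] = sum(left_sum_error(n, t, rng) ** 2 for _ in range(n_paths)) / n_paths
    print(n, mse[n], t * t / (2 * n))   # Monte Carlo estimate vs t^2 / (2n)
```

Each fourfold refinement of the partition should reduce the mean squared error by about a factor of four.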



4.1.6 Approximation 

Suppose that $H$ is a left-continuous adapted process, and that $P = \{0 = t_0 < t_1 < \dots < t_{n(P)} = t\}$ is a partition of $[0,t]$. For $s \in (0,t]$, let $k(s) := \max\{k : t_k < s\}$. Then as $\sigma(P) \to 0$, we have $t_{k(s)} \uparrow s$. Since $H$ is left-continuous, we see that $H_{t_{k(s)}} \to H_s$ a.s. Thus if we define the simple predictable process $H^{(P)}$ by

\[ H^{(P)} := \sum_{k=1}^{n(P)} H_{t_{k-1}} I_{(t_{k-1}, t_k]} \]

then $H_s^{(P)} \to H_s$ a.s. as $\sigma(P) \to 0$, for all $s \in (0,t]$. Thus the simple predictable processes $H^{(P)}$ form better and better approximations to $H$.

Now note that $\int_0^t H_s^{(P)}\,dM_s = \sum_{k=1}^{n(P)} H_{t_{k-1}}(M_{t_k} - M_{t_{k-1}})$.

The following fact is therefore hopefully not too hard to believe:

Proposition 4.1.7 If $H \in L^2(M)$ is left-continuous, and $P = \{0 = t_0 < t_1 < \dots < t_{n(P)} = t\}$ is a partition of $[0,t]$, then

\[ \sum_{k=1}^{n(P)} H_{t_{k-1}}(M_{t_k} - M_{t_{k-1}}) \to \int_0^t H_s\,dM_s \]

as $\sigma(P) \to 0$.

This represents the stochastic integral as a limit of left-hand Riemann-Stieltjes sums.

We now have two approximations that we will use repeatedly. Suppose given partitions $P = \{0 = t_0 < t_1 < \dots < t_{n(P)} = t\}$ of $[0,t]$, with $\sigma(P) \to 0$.

1. The quadratic covariation $[M,N]$ of two martingales can be approximated by sums:

\[ [M,N]_t \approx \sum_{k=1}^{n(P)} (M_{t_k} - M_{t_{k-1}})(N_{t_k} - N_{t_{k-1}}) = \sum_k \Delta_k M \cdot \Delta_k N \quad \text{as } \sigma(P) \to 0 \]

In particular, when $M = N$, we have $[M]_t = [M,M]_t \approx \sum_k (\Delta_k M)^2$.
We thus have:

\[ \Delta_k[M,N] \approx \Delta_k M \cdot \Delta_k N \qquad \Delta_k[M] \approx (\Delta_k M)^2 \]

2. The stochastic integral can be approximated by left-hand Riemann-Stieltjes sums:

\[ \int_0^t H_s\,dM_s \approx \sum_k H_{t_{k-1}}\,\Delta_k M \]

We thus have:

\[ \Delta_k(H \bullet M) \approx H_{t_{k-1}}\,\Delta_k M \]
4.1.7 Quadratic Variation and Covariation of Stochastic Integrals 

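What is the quadratic variation of $H \bullet M$?

This (and more) is answered by the following proposition:

Proposition 4.1.8 Let $M, N \in c\mathcal{M}_0^2$, and $H \in L^2(M)$, $K \in L^2(N)$. Then

\[ [H \bullet M, K \bullet N]_t = \int_0^t H_s K_s\,d[M,N]_s \]

Proof: Some rather complicated analysis shows that the following approximations are OK:

\[ [H \bullet M, K \bullet N]_t \approx \sum_k \Delta_k(H \bullet M) \cdot \Delta_k(K \bullet N) \]

\[ \approx \sum_k H_{t_{k-1}} K_{t_{k-1}}\,\Delta_k M\,\Delta_k N \quad \text{since } \Delta_k(H \bullet M) \approx H_{t_{k-1}} \Delta_k M \]

\[ \approx \sum_k H_{t_{k-1}} K_{t_{k-1}}\,\Delta_k[M,N] \quad \text{since } \Delta_k M \cdot \Delta_k N \approx \Delta_k[M,N] \]

\[ \approx \int_0^t H_s K_s\,d[M,N]_s \]

⊣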
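A quick numerical illustration of Proposition 4.1.8 in the simplest case $M = N = B$ (Brownian motion) and $H = K$: the quadratic variation of $H \bullet B$ should be $\int_0^t H_s^2\,ds$. In the sketch below the integrand is the deterministic process $H_s = s$, so the target value is $t^3/3$; the grid size and seed are arbitrary illustrative choices.

```python
import math
import random

rng = random.Random(6)                  # fixed seed, arbitrary
t, n = 1.0, 20000
dt = t / n

qv_integral = 0.0                       # sum of squared increments of H . B
for k in range(n):
    h = k * dt                          # H_s = s evaluated at the left endpoint
    db = rng.gauss(0.0, math.sqrt(dt))
    qv_integral += (h * db) ** 2        # (Delta_k (H . B))^2 ~ H^2 (Delta_k B)^2

exact = t ** 3 / 3                      # int_0^t s^2 d[B]_s = int_0^t s^2 ds
print(qv_integral, exact)
```

The two numbers agree up to a small random error that shrinks as the grid is refined.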

4.1.8 The Associative Law 

Suppose that $M \in c\mathcal{M}_0^2$ and that $H \in L^2(M)$. Since $H \bullet M$ is again a martingale in $c\mathcal{M}_0^2$, we may integrate with respect to it, i.e. given another predictable process $K$, we may ask:

What is $K \bullet (H \bullet M)$?

Proposition 4.1.9

\[ K \bullet (H \bullet M) = (KH) \bullet M \]

Proof: Some more complicated analysis shows that the following approximations are OK:

\[ (K \bullet (H \bullet M))_t \approx \sum_k K_{t_{k-1}}\,\Delta_k(H \bullet M) \quad \text{(left-hand Riemann-Stieltjes sum)} \]

\[ \approx \sum_k K_{t_{k-1}} H_{t_{k-1}}\,\Delta_k M \quad \text{since } \Delta_k(H \bullet M) \approx H_{t_{k-1}} \Delta_k M \]

\[ \approx \int_0^t K_s H_s\,dM_s \]

4.2 Integration w.r.t. Semimartingales 
4.2.1 Continuous Local Martingales 

A stochastic process $M$ is said to be a continuous local martingale if and only if there is a sequence of stopping times $\tau_n \uparrow \infty$ a.s. such that each stopped process $M^{\tau_n}$ is a continuous martingale. Such a sequence of stopping times is called a localizing sequence for $M$.

If $\tau_n$ is a localizing sequence for $M$, then $\tau_n \wedge t \uparrow t$ as $n \to \infty$, and hence $M_t^{\tau_n} := M_{\tau_n \wedge t} \to M_t$ a.s. Observe that if $\tau_n \le \tau_m$ then $M_t^{\tau_n} = M_t^{\tau_m}$ for $t \le \tau_n$. By choosing $\tau_n$ intelligently, we can ensure that the stopped processes $M^{\tau_n}$ are "nice". Then, by taking limits $n \to \infty$, we can lift results from "nice" martingales to continuous local martingales. This process is called localization.

Observe: 

• Since stopped martingales are martingales, every martingale is a local martingale. The
converse is not true.

• By choosing $\tau_n$ intelligently (e.g. $\tau_n := \inf\{t : |M_t| = n\}$), we can ensure that each $M^{\tau_n}$
is a bounded martingale, and hence square-integrable.

• If each $M^{\tau_n}$ is a square-integrable continuous martingale, then the quadratic variations
$[M^{\tau_n}]_t$ and stochastic integrals $H \bullet M^{\tau_n}$ have already been defined.

• If each $M^{\tau_n} \in c\mathcal{M}^2_0$, then each $M^{\tau_n}$ has a quadratic variation process $[M^{\tau_n}]_t$. We may
then define the quadratic variation $[M]$ of the continuous local martingale $M$ by taking
limits: $[M]_t = \lim_n [M^{\tau_n}]_t$. Since $M^{\tau_n}_t = M^{\tau_m}_t$ when $t \le \tau_n \le \tau_m$, it is easy to see that
the limit exists and that $[M]^{\tau_n}_t = [M^{\tau_n}]_t$.

• Now $(M^2 - [M])^{\tau_n}_t = (M^{\tau_n}_t)^2 - [M]^{\tau_n}_t = (M^{\tau_n}_t)^2 - [M^{\tau_n}]_t$ is a continuous martingale.
Hence $M_t^2 - [M]_t = \lim_n (M^2 - [M])^{\tau_n}_t$ is a continuous local martingale. Furthermore,
the quadratic variation $[M]_t$ is the unique increasing process with $[M]_0 = 0$ such that
$M_t^2 - [M]_t$ is a continuous local martingale. (However, $M_t^2 - [M]_t$ need not be a
martingale if $M$ is a local martingale: That requires $M$ to be a square-integrable
martingale.)






• We may therefore define $H \bullet M$ as a limit of the integrals $H \bullet M^{\tau_n}$. This means that
$H \bullet M$ is a limit of martingales, and hence itself a local martingale. [One needs the
fact that $(H \bullet M)^{\tau_n} = H \bullet M^{\tau_n}$, i.e. that $\int_0^{t \wedge \tau_n} H_s \, dM_s = \int_0^t H_s \, dM^{\tau_n}_s$, which seems
obvious.]

• We can also extend our set of integrands: Instead of requiring that $H$ is predictable
with $\mathbb{E}[\int_0^\infty H_s^2 \, d[M]_s] < \infty$, it suffices that $H$ be predictable with $\int_0^t H_s^2 \, d[M]_s < \infty$ a.s.
for all $t \ge 0$. But then the Itô isometry need not hold, as $\|H\|_{L^2(M)}$ may be infinite.

• The extended integral thus defined will inherit any of the properties of the $L^2$-integral
which are stable under stopping and taking limits. In particular, it will be linear.

For the remainder of this course, we will pay scant attention to local martingales, and 
pretend that a local martingale is a martingale, hoping that any problems can be ironed out 
by localization. 

4.2.2 Continuous Semimartingales

We know how to integrate w.r.t. a process $A$ which is locally of finite variation: $(H \bullet A)_t =
\int_0^t H_s \, dA_s$ is simply a Riemann-Stieltjes integral (for each $\omega \in \Omega$). We are therefore now able
to integrate w.r.t. continuous local martingales and w.r.t. processes of finite variation.

Definition 4.2.1 An adapted cadlag process $X_t$ is said to be a continuous semimartingale
if and only if it has a decomposition

\[ X_t = X_0 + M_t + A_t \]

where $M$ is a continuous local martingale with $M_0 = 0$, and $A$ is an adapted finite variation
process with $A_0 = 0$.

Observe that every continuous local martingale and every finite variation process is a 
semimartingale. Moreover, the class of semimartingales is a vector space. 

Now integration w.r.t. a semimartingale is easy: If $H$ is predictable, and $X = X_0 + M + A$
is a semimartingale, then we define

\[ \int_0^t H_s \, dX_s = \int_0^t H_s \, dM_s + \int_0^t H_s \, dA_s \]

Proposition 4.2.2 If $X_t = X_0 + M_t + A_t$ is a continuous semimartingale, and $H$ is predictable,
then $H \bullet X$ is a semimartingale.

Proof: We have $H \bullet X = H \bullet M + H \bullet A$, by definition. We know that if $M$ is a local
martingale, then $H \bullet M$ is a local martingale. It therefore suffices to prove that $H \bullet A$ is of
finite variation, and for that it suffices to show that $H \bullet A$ is a difference of two increasing
processes.

Now clearly if $H$ is non-negative and $A$ is increasing, then $H \bullet A$ is increasing also. Now
since $A$ is of finite variation, it is the difference of two increasing processes $A = A^{(1)} - A^{(2)}$.
Furthermore, $H_t = H_t^+ - H_t^-$, where $H_t^+ = \max\{H_t, 0\}$ and $H_t^- = \max\{-H_t, 0\}$. It follows
that

\[ H \bullet A = (H^+ - H^-) \bullet (A^{(1)} - A^{(2)}) = (H^+ \bullet A^{(1)} + H^- \bullet A^{(2)}) - (H^+ \bullet A^{(2)} + H^- \bullet A^{(1)}) \]

is a difference of two increasing processes, as $H^+, H^-$ are non-negative, and $A^{(1)}, A^{(2)}$ are
increasing. Hence $H \bullet A$ is of finite variation, and thus

\[ H \bullet X = H \bullet M + H \bullet A \]

is a semimartingale decomposition of $H \bullet X$.



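If $X$ is a semimartingale, then the quadratic variation $[X]_t$ can be defined as a limit (in
probability) of $\sum_k (X_{t_k} - X_{t_{k-1}})^2 = \sum_k (\Delta_k X)^2$ as $\sigma(P) \to 0$.

Similarly, the quadratic covariation of two continuous semimartingales can be defined by

\[ [X, Y]_t = \lim_{\sigma(P) \to 0} \sum_{k=1}^{n(P)} \Delta_k X \, \Delta_k Y \]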

Now recall that a continuous finite variation process has zero quadratic variation. We can
say more: If $X$ is a continuous semimartingale, and if $A$ is continuous of finite variation, then
$[X, A]_t = 0$ for all $t$. Indeed

\[ |[X, A]_t| \approx \Big| \sum_k \Delta_k X \, \Delta_k A \Big| \le \max_k \{ |\Delta_k X| \} \cdot \sum_k |\Delta_k A| \]

Now as $\sigma(P) \to 0$, we have $\max_k \{|\Delta_k X|\} \to 0$, as $X$ is continuous, whereas $\sum_k |\Delta_k A|$
converges to the variation $V_A[0, t]$ of $A$ over $[0, t]$, which is finite. Thus $|[X, A]_t| \le 0 \cdot V_A[0, t] = 0$.

In particular, if $X, Y$ are continuous semimartingales with decompositions $X_t = X_0 +
M_t + A_t$ and $Y_t = Y_0 + N_t + B_t$, then

[X,Y] = [M + A,N + B] = [M,N] + [M,B] + [N,A] + [A,B] = [M,N] 

i.e. the quadratic covariation of two semimartingales equals the quadratic covariation of their 
local martingale parts. 
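
The claim that the finite variation part contributes nothing to the quadratic variation can be checked numerically. The following is a minimal sketch, assuming a single simulated Brownian path $W$ on a uniform grid and the hypothetical finite variation process $A_t = \mu t$ (the function name, the grid, and the value of `mu` are illustrative choices, not part of the notes). It computes the sums $\sum_k (\Delta_k X)^2$ for $X = W + A$, $\sum_k (\Delta_k A)^2$ and $\sum_k \Delta_k W \, \Delta_k A$.

```python
import math
import random


def realized_qv(mu=3.0, t=1.0, n_steps=20000, seed=0):
    """Realized quadratic (co)variation sums on a uniform partition of [0, t].

    X = W + A with W a simulated Brownian path and A_t = mu * t (assumed example).
    Returns (sum (dX)^2, sum (dA)^2, sum dW * dA).
    """
    rng = random.Random(seed)
    dt = t / n_steps
    qv_x = 0.0    # approximates [X]_t, expected to be close to t
    qv_a = 0.0    # approximates [A]_t, expected to be close to 0
    cov_wa = 0.0  # approximates [W, A]_t, expected to be close to 0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment
        da = mu * dt                        # finite variation increment
        qv_x += (dw + da) ** 2
        qv_a += da ** 2
        cov_wa += dw * da
    return qv_x, qv_a, cov_wa


qv_x, qv_a, cov_wa = realized_qv()
print("sum (dX)^2 :", qv_x)
print("sum (dA)^2 :", qv_a)
print("sum dW dA  :", cov_wa)
```

On a fine grid the first sum should be close to $t$, while the other two should be close to zero, in line with $[X] = [W]$.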



Note: It is not true that $X_t^2 - [X]_t$ is a martingale when $X$ is a semimartingale. That holds
when $X$ is a continuous square-integrable martingale.



It can be proved that semimartingales are the "most general integrators" — This is the 
Bichteler-Dellacherie Theorem, and you can go and find out what "most general" means 
yourself. Interestingly, it can also be proved that if a market model has no simple predictable 
arbitrage opportunities, then the asset prices must be semimartingales — This is a result of 
Delbaen and Schachermayer. 

One common class of semimartingales in mathematical finance is the class of Itô processes:
A semimartingale $X$ is an Itô process if it has decomposition

\[ X_t = X_0 + \int_0^t H_s \, dW_s + \int_0^t K_s \, ds \]

for predictable $H$ and adapted $K$.






4.2.3 Approximation 

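Let $X, Y$ be two continuous semimartingales, with decompositions $X_t = X_0 + M_t + A_t$ and
$Y_t = Y_0 + N_t + B_t$, and let $H, K$ be predictable processes. As in the $L^2$-case, we have two
approximations:

1. $[X, Y]_t \approx \sum_k \Delta_k X \cdot \Delta_k Y$. Since $[X, Y] = [M, N]$, this yields

\[ \Delta_k [X, Y] \approx \Delta_k X \cdot \Delta_k Y \approx \Delta_k M \cdot \Delta_k N \]

and thus

\[ \Delta_k [X] \approx (\Delta_k X)^2 \]

2. $\int_0^t H_s \, dX_s = \int_0^t H_s \, dM_s + \int_0^t H_s \, dA_s \approx \sum_k H_{t_{k-1}} \Delta_k M + \sum_k H_{t_{k-1}} \Delta_k A = \sum_k H_{t_{k-1}} \Delta_k X$.
Thus

\[ \Delta_k (H \bullet X) \approx H_{t_{k-1}} \Delta_k X \]

As these were the only things used, it follows exactly as for the $L^2$-case that

• (Quadratic Covariation) $[H \bullet X, K \bullet Y]_t = \int_0^t H_s K_s \, d[X, Y]_s$, i.e. that $[H \bullet X, K \bullet Y] =
HK \bullet [X, Y]$.

• (Associative Law) $K \bullet (H \bullet X) = (KH) \bullet X$.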

4.3 The Itô Formula

Theorem 4.3.1 (One-dimensional Itô Formula) Suppose that $X_t$ is a continuous semimartingale
and that $f \in C^2(\mathbb{R})$ is a twice-continuously differentiable function from $\mathbb{R}$ to $\mathbb{R}$.
Then $f(X_t)$ is a continuous semimartingale, with decomposition

\[ f(X_t) = f(X_0) + \int_0^t f'(X_s) \, dX_s + \frac{1}{2} \int_0^t f''(X_s) \, d[X]_s \]

Proof: Here's another heuristic proof involving approximations that can be justified by some
complicated analysis. Recall Taylor's Theorem: If $g$ is $C^2$, then $g(t + h) = g(t) + g'(t) h +
\frac{1}{2} g''(t) h^2 + o(h^2)$.

As usual, consider a partition $P = \{0 = t_0 < t_1 < \dots < t_{n(P)} = t\}$ of $[0, t]$. Using the Taylor
approximation, we have

\[
\begin{aligned}
f(X_{t_k}) &\approx f(X_{t_{k-1}}) + f'(X_{t_{k-1}}) \cdot (X_{t_k} - X_{t_{k-1}}) + \tfrac{1}{2} f''(X_{t_{k-1}}) \cdot (X_{t_k} - X_{t_{k-1}})^2 \\
&= f(X_{t_{k-1}}) + f'(X_{t_{k-1}}) \Delta_k X + \tfrac{1}{2} f''(X_{t_{k-1}}) (\Delta_k X)^2
\end{aligned}
\]

Now $(\Delta_k X)^2 \approx \Delta_k [X]$, so

\[ f(X_t) - f(X_0) = \sum_k \big( f(X_{t_k}) - f(X_{t_{k-1}}) \big) \approx \sum_k f'(X_{t_{k-1}}) \Delta_k X + \frac{1}{2} \sum_k f''(X_{t_{k-1}}) \Delta_k [X] \]

Now as $\sigma(P) \to 0$, the first sum converges to $\int_0^t f'(X_s) \, dX_s$ and the second sum converges to
$\int_0^t f''(X_s) \, d[X]_s$.

□

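Now if $f : \mathbb{R}^m \to \mathbb{R}$ is twice-continuously differentiable, then

\[ f(x_1 + \Delta x_1, \dots, x_m + \Delta x_m) \approx f(x_1, \dots, x_m) + \sum_{i=1}^m \frac{\partial f}{\partial x_i}(x_1, \dots, x_m) \, \Delta x_i + \frac{1}{2} \sum_{1 \le i,j \le m} \frac{\partial^2 f}{\partial x_i \partial x_j}(x_1, \dots, x_m) \, \Delta x_i \, \Delta x_j \]

Approximating as above, we get the following theorem.

Theorem 4.3.2 Suppose that $X = (X^1, \dots, X^m)$ is an $m$-tuple of continuous semimartingales,
and that $f : \mathbb{R}^m \to \mathbb{R}$ has continuous partial derivatives of second order. Then
$f(X^1_t, \dots, X^m_t)$ is a semimartingale, with decomposition

\[ f(X_t) = f(X_0) + \sum_{i=1}^m \int_0^t \frac{\partial f}{\partial x_i}(X_s) \, dX^i_s + \frac{1}{2} \sum_{1 \le i,j \le m} \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \, d[X^i, X^j]_s \]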

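Example 4.3.3 Let $X = (B_t, t)$, where $B$ is a Brownian motion, and consider $f(x, t) =
x^2 - t$. Then by the Itô formula, and using the fact that $t$ is of finite variation, so that
$[B, t] = 0 = [t, t]$, we see that

\[ B_t^2 - t = f(B_t, t) = f(B_0, 0) + \int_0^t \frac{\partial f}{\partial x}(B_s, s) \, dB_s + \int_0^t \frac{\partial f}{\partial t}(B_s, s) \, ds + \frac{1}{2} \int_0^t \frac{\partial^2 f}{\partial x^2}(B_s, s) \, d[B, B]_s \]

Now $[B, B]_t = [B]_t = t$, so we get

\[ B_t^2 - t = \int_0^t 2 B_s \, dB_s - \int_0^t 1 \, ds + \frac{1}{2} \int_0^t 2 \, ds = 2 \int_0^t B_s \, dB_s \]

And hence

\[ \int_0^t B_s \, dB_s = \frac{1}{2} (B_t^2 - t) \]

as we have already seen.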

□ 
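
The identity of Example 4.3.3 can be illustrated on a simulated path. The sketch below is only an illustration under stated assumptions: it uses a uniform grid on $[0, t]$, Gaussian increments from Python's standard `random` module, and the left-endpoint sum $\sum_k B_{t_{k-1}} \Delta_k B$ as a stand-in for the Itô integral. The function name and parameters are hypothetical.

```python
import math
import random


def ito_integral_b_db(t=1.0, n_steps=20000, seed=0):
    """Left-endpoint Riemann sum for int_0^t B dB along one simulated path.

    Returns the sum and the terminal value B_t of the same path.
    """
    rng = random.Random(seed)
    dt = t / n_steps
    b = 0.0        # current value B_{t_{k-1}}
    ito_sum = 0.0  # running sum of B_{t_{k-1}} * (B_{t_k} - B_{t_{k-1}})
    for _ in range(n_steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        ito_sum += b * db  # integrand evaluated at the LEFT endpoint
        b += db
    return ito_sum, b


ito_sum, b_t = ito_integral_b_db()
closed_form = 0.5 * (b_t ** 2 - 1.0)  # (B_t^2 - t) / 2 with t = 1
print("left-endpoint sum :", ito_sum)
print("(B_t^2 - t) / 2   :", closed_form)
```

The two printed numbers differ by exactly $\frac{1}{2}\big(t - \sum_k (\Delta_k B)^2\big)$, which is small on a fine grid because the realized quadratic variation is close to $t$.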

4.4 Differential Notation 

Now everything becomes simple. . . 
• We introduce two abbreviations.

(i) Let

$dY_t = H_t \, dX_t$ be shorthand for $Y_t = Y_0 + \int_0^t H_s \, dX_s$,

i.e. $d(H \bullet X)_t = H_t \, dX_t$.

(ii) Further, let

$dX_t \, dY_t$ be shorthand for $d[X, Y]_t$.

Note that $dX_t \, dA_t = 0$ if $A$ is of bounded variation. Thus for any semimartingales
$X, Y, Z$ we have $dX_t \, (dY_t \, dZ_t) = 0$ (because $[Y, Z]_t$ is of bounded variation).

• The distributive laws become 

\[ (H_t + K_t) \, dX_t = H_t \, dX_t + K_t \, dX_t \qquad H_t \, d(X + Y)_t = H_t \, dX_t + H_t \, dY_t \]
This looks obvious. 

• The associative law states that if $Y = K \bullet X$, then $H \bullet Y = HK \bullet X$. In differential
notation $H_t \, dY_t = H_t K_t \, dX_t$, i.e.

\[ H_t (K_t \, dX_t) = (H_t K_t) \, dX_t \]

This looks obvious. 

• The covariation of stochastic integrals: $d[H \bullet X, K \bullet Y]_t$ is (by abbreviation (ii))
$d(H \bullet X)_t \, d(K \bullet Y)_t$. By abbreviation (i), this can be written as $(H_t \, dX_t)(K_t \, dY_t)$.
This looks like it ought to be equal to $H_t K_t \, dX_t \, dY_t$.

However: In abbreviation (ii), dXt dYt is defined as a single object. In the expression 
HtKt dXt dYt, the dXt and dYt were obtained separately from abbreviation (i). 
But the covariation of stochastic integrals is given by 

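\[ [H \bullet X, K \bullet Y]_t = \int_0^t H_s K_s \, d[X, Y]_s \]

which abbreviates as $d[H \bullet X, K \bullet Y]_t = H_t K_t \, d[X, Y]_t = H_t K_t \, dX_t \, dY_t$.
Thus the equation for the covariation of stochastic integrals becomes

\[ (H_t \, dX_t)(K_t \, dY_t) = H_t K_t \, dX_t \, dY_t \]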

Again: This looks obvious. 

(But it isn't: The dX,dY on the lefthand side are obtained from abbreviation (i), 
whereas those on the righthand side are obtained from abbreviation (ii). Thus our 
notation is consistent.) 

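• The Itô formula can now be written as

\[ df(X_t) = \sum_{k \le n} \frac{\partial f}{\partial x_k}(X_t) \, dX^k_t + \frac{1}{2} \sum_{k,j \le n} \frac{\partial^2 f}{\partial x_k \partial x_j}(X_t) \, dX^k_t \, dX^j_t \]

This looks like the familiar second order Taylor expansion.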






Chapter 5 

Girsanov's Theorem and the 
Martingale Representation 
Theorem 



5.1 Changes of Measure and Girsanov's Theorem 

Girsanov's Theorem plays an important role in the construction of equivalent martingale 
measures for processes, and is therefore a cornerstone of the theory of martingale pricing. 
This section gives a reasonably thorough introduction to it. 

5.1.1 Characteristic Functions and Stochastic Exponentials

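Suppose that $(X_1, \dots, X_n)$ is a random vector on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Recall that
the characteristic function $\varphi_{X_1, \dots, X_n} : \mathbb{R}^n \to \mathbb{C}$ is defined by

\[ \varphi_{X_1, \dots, X_n}(t_1, \dots, t_n) = \mathbb{E}\big[ e^{i (t_1 X_1 + \dots + t_n X_n)} \big] \]

The most important fact about characteristic functions is the following: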

Theorem 5.1.1 (Levy) Two random vectors have the same distribution if and only if they 
have the same characteristic function. 

□ 

The proof of the above theorem may be found in almost any advanced text on probability 
theory. 

As a simple corollary, we have the following useful result: 
Corollary 5.1.2 Two random variables $X, Y$ are independent if and only if $\varphi_{X,Y}(s, t) =
\varphi_X(s) \varphi_Y(t)$ for all $s, t$.

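Proof: If $X, Y$ are independent, then

\[ \varphi_{X,Y}(s, t) = \mathbb{E}[e^{i(sX + tY)}] = \mathbb{E}[e^{isX}] \, \mathbb{E}[e^{itY}] = \varphi_X(s) \varphi_Y(t) \]

Conversely, suppose that $\varphi_{X,Y}(s, t) = \varphi_X(s) \varphi_Y(t)$. Let $\tilde{X}, \tilde{Y}$ be independent random
variables such that $\tilde{X}$ has the same distribution as $X$, and $\tilde{Y}$ as $Y$. Then $\varphi_{\tilde{X}} = \varphi_X$ and
$\varphi_{\tilde{Y}} = \varphi_Y$. Thus

\[ \varphi_{X,Y}(s, t) = \varphi_X(s) \varphi_Y(t) = \varphi_{\tilde{X}}(s) \varphi_{\tilde{Y}}(t) = \varphi_{\tilde{X},\tilde{Y}}(s, t) \]

as $\tilde{X}, \tilde{Y}$ are independent. It follows that $(X, Y)$ and $(\tilde{X}, \tilde{Y})$ have the same characteristic
function, and thus the same distribution. In particular, $X$ is independent of $Y$, because $\tilde{X}$ is
independent of $\tilde{Y}$.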

□ 

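Theorem 5.1.3 (Kac) A random vector $X$ is independent of a $\sigma$-algebra $\mathcal{G}$ if and only if

\[ \mathbb{E}[e^{i t \cdot X} \mid \mathcal{G}] = \mathbb{E}[e^{i t \cdot X}] \quad \text{for all } t \]

Proof: ($\Rightarrow$) is a basic property of conditional expectation.

($\Leftarrow$) We will prove it for the one-dimensional case. Let $Y$ be any $\mathcal{G}$-measurable random
variable. Then, using the properties of conditional expectation,

\[ \varphi_{X,Y}(s, t) = \mathbb{E}[e^{i(sX + tY)}] = \mathbb{E}\big[ \mathbb{E}[e^{i(sX + tY)} \mid \mathcal{G}] \big] = \mathbb{E}\big[ e^{itY} \mathbb{E}[e^{isX} \mid \mathcal{G}] \big] = \mathbb{E}[e^{isX}] \, \mathbb{E}[e^{itY}] = \varphi_X(s) \varphi_Y(t) \]

Hence $X$ is independent of every $\mathcal{G}$-measurable random variable, and thus independent of $\mathcal{G}$.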

□ 

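The Doléans exponential of a stochastic process $M$ is defined by:

\[ \mathcal{E}(M)_t := e^{M_t - \frac{1}{2} [M]_t} \]

Observe that $\mathcal{E}(M)_t = f(M_t, [M]_t)$ is a function of two processes, namely $M_t$ and its quadratic
variation $[M]_t$, where

\[ f(x, y) = e^{x - \frac{1}{2} y} \]

Applying Itô's formula, we see that

\[
\begin{aligned}
d\mathcal{E}(M)_t &= \mathcal{E}(M)_t \, dM_t - \tfrac{1}{2} \mathcal{E}(M)_t \, d[M]_t + \tfrac{1}{2} \mathcal{E}(M)_t \, d[M]_t \\
&= \mathcal{E}(M)_t \, dM_t
\end{aligned}
\]

and hence

\[ d\mathcal{E}(M)_t = \mathcal{E}(M)_t \, dM_t \]

In particular, $\mathcal{E}(M)_t$ is a local martingale whenever $M_t$ is a local martingale.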

A nice application of stochastic exponentials and characteristic functions is given by the 
following theorem: 

Theorem 5.1.4 (Lévy's Characterization of Brownian Motion) If $M = (M^1_t, \dots, M^d_t)_t$ is a
continuous $d$-dimensional local martingale such that $M_0 = 0$ and $[M^i, M^j]_t = \delta_{ij} t$, then $M$
is a standard $d$-dimensional Brownian motion.

Proof: We give the proof for the case $d = 1$, and leave the extension to higher dimensions
as an exercise.

Fix $u \in \mathbb{R}$, and define $Y_t := \mathcal{E}(iuM)_t = e^{iuM_t + \frac{1}{2} u^2 t}$. Then $dY_t = iu Y_t \, dM_t$, and hence $Y_t$
is a local martingale. It follows easily from the fact that $|Y_t| = e^{\frac{1}{2} u^2 t} < \infty$ that $Y$ is a genuine
martingale. Thus $Y_s = \mathbb{E}[Y_t \mid \mathcal{F}_s]$, i.e.

\[ \mathbb{E}[e^{iu(M_t - M_s)} \mid \mathcal{F}_s] = e^{-\frac{1}{2} u^2 (t - s)} \]

The right-hand side is deterministic, so taking expectations on both sides, we obtain

\[ \mathbb{E}[e^{iu(M_t - M_s)}] = e^{-\frac{1}{2} u^2 (t - s)} \qquad \mathbb{E}[e^{iu(M_t - M_s)} \mid \mathcal{F}_s] = \mathbb{E}[e^{iu(M_t - M_s)}] \]

From the first equation, we see that $M_t - M_s$ has the same characteristic function as an
$N(0, t - s)$-variable, so that it is an $N(0, t - s)$-variable. The second equation, combined
with Kac's Theorem, shows that $M_t - M_s$ is independent of $\mathcal{F}_s$.

□
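
The characteristic-function identity used in this proof can be explored empirically. The following sketch is only an illustration: it draws samples of a Brownian increment directly from the $N(0, t - s)$ law (so it does not test the theorem itself), and compares the empirical average of $e^{iuX}$ with $e^{-\frac{1}{2} u^2 (t - s)}$. The function name `empirical_char_fn` and the values of `s`, `t`, `u` are hypothetical choices.

```python
import cmath
import math
import random


def empirical_char_fn(samples, u):
    """Monte Carlo estimate of E[exp(i * u * X)] from a list of real samples."""
    return sum(cmath.exp(1j * u * x) for x in samples) / len(samples)


rng = random.Random(0)
s, t, u = 0.5, 2.0, 1.3  # illustrative values only
# Samples standing in for the increment M_t - M_s of a Brownian motion.
increments = [rng.gauss(0.0, math.sqrt(t - s)) for _ in range(50000)]

phi_hat = empirical_char_fn(increments, u)
phi_theory = math.exp(-0.5 * u ** 2 * (t - s))
print("empirical  :", phi_hat)
print("theoretical:", phi_theory)
```

The real part of the estimate should be close to the theoretical value and the imaginary part close to zero, up to Monte Carlo error.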

Another useful result is the following: 
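Theorem 5.1.5 If $f$ is a deterministic function, and $W$ a standard Brownian motion, then

\[ X_t := \int_0^t f(u) \, dW_u \]

is a Gaussian process with independent increments, such that

\[ X_t - X_s \sim N\Big( 0, \int_s^t |f(u)|^2 \, du \Big) \]

Proof: Fix $\theta \in \mathbb{R}$, define $Y$ to be the martingale $Y_t = \mathcal{E}(i\theta X)_t$, and deduce that

\[ \mathbb{E}\big[ e^{i\theta \int_s^t f(u) \, dW_u} \mid \mathcal{F}_s \big] = e^{-\frac{1}{2} \theta^2 \int_s^t f(u)^2 \, du} \]

noting that $[X]_t = \int_0^t f(u)^2 \, du$, and using the fact that $f$ is deterministic. Now proceed as
in the proof of Lévy's Characterization.

□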
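
As a rough numerical illustration of Theorem 5.1.5, the sketch below approximates $X_1 = \int_0^1 f(u) \, dW_u$ for the assumed integrand $f(u) = u$ by left-endpoint sums over many simulated paths, and compares the sample variance with $\int_0^1 u^2 \, du = 1/3$. The helper names, the grid size and the number of paths are illustrative assumptions, and the discretisation introduces a small bias.

```python
import math
import random


def wiener_integral_samples(f, t=1.0, n_steps=50, n_paths=4000, seed=0):
    """Samples of sum_k f(t_{k-1}) * (W_{t_k} - W_{t_{k-1}}) on a uniform grid."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    samples = []
    for _path in range(n_paths):
        x = 0.0
        for k in range(n_steps):
            # deterministic integrand at the left endpoint times a Brownian increment
            x += f(k * dt) * rng.gauss(0.0, sqrt_dt)
        samples.append(x)
    return samples


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


samples = wiener_integral_samples(lambda u: u)
m, v = mean_and_variance(samples)
print("sample mean     :", m)
print("sample variance :", v)
print("int_0^1 u^2 du  :", 1.0 / 3.0)
```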

A question that is important in mathematical finance is the following: 

Given that M is a continuous local martingale, when is £{M) a genuine martingale? 

But, as a matter of policy, we will gloss over the technical differences between martingales and
local martingales. Here, therefore, we will simply state two criteria that partially answer 
this question. See the book Stochastic Integration and Differential Equations by Protter for 
proofs. 

Theorem 5.1.6 (Kazamaki's criterion) Suppose that $M$ is a continuous local martingale with
the property that

\[ \sup_{\tau} \mathbb{E}[e^{\frac{1}{2} M_\tau}] < \infty \quad \text{where the sup is over all bounded stopping times } \tau \]

Then $\mathcal{E}(M)$ is a UI martingale.






□ 

Theorem 5.1.7 (Novikov's criterion) Let $M$ be a continuous local martingale, and assume
that

\[ \mathbb{E}[e^{\frac{1}{2} [M]_\infty}] < \infty \]

Then $\mathcal{E}(M)$ is a UI martingale.

□ 

5.1.2 Changes of Measure 

When pricing contingent claims, we use risk-neutral valuation: The $t = 0$ price of a claim $X$
is the risk-neutral expectation of its discounted payoff.

\[ X_0 = \mathbb{E}_{\mathbb{Q}}[\bar{X}] \]

The measure $\mathbb{Q}$ is not the same as the "real-world" measure $\mathbb{P}$: we have to change the
probability measure.

Suppose that we start with real-world asset dynamics, e.g. a GBM

\[ \frac{dS_t}{S_t} = \mu \, dt + \sigma \, dW_t \]

on a filtered space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, where $W_t$ is a $(\mathbb{F}, \mathbb{P})$-BM. There are two questions that concern
us:

• What happens to the dynamics of St when we change measures? 

• How do we actually go about changing measures? 

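This is a good time to recall Bayes' Theorem for calculating conditional expectations when
we change the measure:

Theorem 5.1.8 (Bayes' Theorem)

Suppose that $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space equipped with a filtration $(\mathcal{F}_t)_t$, and that $\mathbb{Q} \ll \mathbb{P}$.
Let $\xi = d\mathbb{Q}/d\mathbb{P}$, with likelihood process $\xi_t = \mathbb{E}_{\mathbb{P}}[\xi \mid \mathcal{F}_t]$. Then

(a) For any random variable $Z$ (integrable w.r.t. $\mathbb{P}$ and $\mathbb{Q}$) we have

\[ \xi_t \, \mathbb{E}_{\mathbb{Q}}[Z \mid \mathcal{F}_t] = \mathbb{E}_{\mathbb{P}}[\xi Z \mid \mathcal{F}_t] \]

(b) If $\mathbb{Q} \approx \mathbb{P}$, then a stochastic process $X_t$ is a martingale under $\mathbb{Q}$ if and only if $\xi_t X_t$ is a
martingale under $\mathbb{P}$.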

The proof is an exercise: 

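Exercise 5.1.9 Suppose that $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space equipped with a filtration $(\mathcal{F}_t)_t$,
and that $\mathbb{Q} \ll \mathbb{P}$. Let $\xi = d\mathbb{Q}/d\mathbb{P}$ and define $\xi_t = \mathbb{E}_{\mathbb{P}}[\xi \mid \mathcal{F}_t]$, where $\mathbb{E}_{\mathbb{P}}$ refers to expectation
w.r.t. the measure $\mathbb{P}$.

(a) Show that for any random variable $Z$ (integrable w.r.t. $\mathbb{P}$ and $\mathbb{Q}$) we have

\[ \xi_t \, \mathbb{E}_{\mathbb{Q}}[Z \mid \mathcal{F}_t] = \mathbb{E}_{\mathbb{P}}[\xi Z \mid \mathcal{F}_t] \]

(b) Show that if $\mathbb{Q} \approx \mathbb{P}$, then a stochastic process $X_t$ is a martingale under $\mathbb{Q}$ if and only if
$\xi_t X_t$ is a martingale under $\mathbb{P}$.

[Hint: (a) I'll give you the proof. You justify every step: Let $A \in \mathcal{F}_t$. Then

\[
\begin{aligned}
\int_A \xi_t \, \mathbb{E}_{\mathbb{Q}}[Z \mid \mathcal{F}_t] \, d\mathbb{P} &= \int_A \mathbb{E}_{\mathbb{P}}\big[ \xi \, \mathbb{E}_{\mathbb{Q}}[Z \mid \mathcal{F}_t] \mid \mathcal{F}_t \big] \, d\mathbb{P} \\
&= \int_A \xi \, \mathbb{E}_{\mathbb{Q}}[Z \mid \mathcal{F}_t] \, d\mathbb{P} \\
&= \int_A \mathbb{E}_{\mathbb{Q}}[Z \mid \mathcal{F}_t] \, d\mathbb{Q} \\
&= \int_A Z \, d\mathbb{Q} \\
&= \int_A Z \xi \, d\mathbb{P}
\end{aligned}
\]

(b) Use (a).]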



□ 



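Theorem 5.1.10 (Girsanov's Theorem for Brownian Motion)
Suppose a $d$-dimensional process $Y$ has $\mathbb{P}$-dynamics

\[ dY_t = \mu_t \, dt + \sigma_t \, dW_t \qquad (t \le T) \]

where $W$ is a standard $d$-dimensional $\mathbb{P}$-Brownian motion, $\mu_t(\omega) \in \mathbb{R}^d$, $\sigma_t(\omega) \in \mathbb{R}^{d \times d}$. Let
$\lambda_t(\omega) \in \mathbb{R}^d$ be predictable. Define a measure $\mathbb{Q}$ on $\mathcal{F}_T$ by

\[ \frac{d\mathbb{Q}}{d\mathbb{P}} = e^{\int_0^T \lambda_t \cdot dW_t - \frac{1}{2} \int_0^T \|\lambda_t\|^2 \, dt} \]

Assume that Novikov's condition holds:

\[ \mathbb{E}\big[ e^{\frac{1}{2} \int_0^T \|\lambda_t\|^2 \, dt} \big] < \infty \]

Then:

(i) $\mathbb{Q}$ is a probability measure on $\mathcal{F}_T$.

(ii) $\tilde{W}_t = W_t - \int_0^t \lambda_s \, ds$ is a $\mathbb{Q}$-Brownian motion.

(iii) The $\mathbb{Q}$-dynamics of $Y$ are given by

\[ dY_t = (\mu_t + \sigma_t \lambda_t) \, dt + \sigma_t \, d\tilde{W}_t \]

Proof: We prove the result for the case $d = 1$, and leave the extension to higher dimensions
as an exercise.

By Lévy's characterization, it suffices to show that $\tilde{W}_t$ is a continuous local martingale
with $[\tilde{W}]_t = t$ under $\mathbb{Q}$. Now certainly

\[ d[\tilde{W}]_t \text{ ``$=$'' } (d\tilde{W}_t)^2 = (dW_t - \lambda_t \, dt)^2 = (dW_t)^2 = d[W]_t = dt \]

so that $[\tilde{W}]_t = [W]_t = t$.

Now let $\xi_T := e^{\int_0^T \lambda_t \, dW_t - \frac{1}{2} \int_0^T \|\lambda_t\|^2 \, dt}$ and $\xi_t = \mathbb{E}[\xi_T \mid \mathcal{F}_t]$. Then $\xi_t = \mathcal{E}(\lambda \bullet W)_t$ is a $\mathbb{P}$-martingale
(by Novikov's condition), with $d\xi_t = \xi_t \lambda_t \, dW_t$. To prove that $\tilde{W}_t$ is a $\mathbb{Q}$-local martingale, it suffices to prove that $\xi_t \tilde{W}_t$
is a $\mathbb{P}$-local martingale, by Bayes' Theorem. But

\[
\begin{aligned}
d(\xi_t \tilde{W}_t) &= \tilde{W}_t \, d\xi_t + \xi_t \, d\tilde{W}_t + d[\tilde{W}, \xi]_t \\
&= \tilde{W}_t \, d\xi_t + \xi_t \, dW_t - \xi_t \lambda_t \, dt + (dW_t)(\xi_t \lambda_t \, dW_t) \\
&= \tilde{W}_t \, d\xi_t + \xi_t \, dW_t
\end{aligned}
\]

Since both $\xi_t$ and $W_t$ are $\mathbb{P}$-local martingales, it follows that $\xi_t \tilde{W}_t$ is the sum of two stochastic
integrals w.r.t. $\mathbb{P}$-local martingales, and thus a $\mathbb{P}$-local martingale.

□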
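
A simple Monte Carlo experiment can make the change of measure concrete. The sketch below is a hedged illustration only: it assumes $d = 1$ and a constant kernel $\lambda$ (so Novikov's condition holds trivially), samples $W_T$ under $\mathbb{P}$, and weights by $\xi_T = e^{\lambda W_T - \frac{1}{2} \lambda^2 T}$. It estimates $\mathbb{E}_{\mathbb{P}}[\xi_T]$, which should be close to 1, and $\mathbb{E}_{\mathbb{Q}}[W_T] = \mathbb{E}_{\mathbb{P}}[\xi_T W_T]$, which should be close to $\lambda T$, so that $\tilde{W}_T = W_T - \lambda T$ has mean zero under $\mathbb{Q}$. The function name and parameter values are hypothetical.

```python
import math
import random


def girsanov_check(lam=0.5, T=1.0, n_paths=100000, seed=0):
    """Estimate E_P[xi_T] and E_P[xi_T * W_T] for a constant Girsanov kernel lam."""
    rng = random.Random(seed)
    sum_xi = 0.0
    sum_xi_w = 0.0
    for _ in range(n_paths):
        w_T = rng.gauss(0.0, math.sqrt(T))               # W_T sampled under P
        xi_T = math.exp(lam * w_T - 0.5 * lam ** 2 * T)  # density dQ/dP on F_T
        sum_xi += xi_T
        sum_xi_w += xi_T * w_T
    total_mass = sum_xi / n_paths        # estimates Q(Omega) = E_P[xi_T]
    mean_w_under_q = sum_xi_w / n_paths  # estimates E_Q[W_T]
    return total_mass, mean_w_under_q


total_mass, mean_w_under_q = girsanov_check()
print("E_P[xi_T]  (theory 1.0)     :", total_mass)
print("E_Q[W_T]   (theory lam * T) :", mean_w_under_q)
```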

Remarks 5.1.11 This has important consequences: Suppose, under the "real world" $\mathbb{P}$, we
have an asset $S$ that follows a geometric Brownian motion with drift parameter $\mu$ and volatility
parameter $\sigma$. Suppose further that the continuously compounded rate $r$ is constant, and let
$\bar{S}_t = e^{-rt} S_t$ denote the discounted asset price. Thus:

\[ dS_t = S_t [\mu \, dt + \sigma \, dW_t] \quad \text{i.e.} \quad d\bar{S}_t = \bar{S}_t [(\mu - r) \, dt + \sigma \, dW_t] \]

Suppose we now construct a new measure $\mathbb{Q}$ as above. This Girsanov transformation with
kernel $\lambda$ adds $\sigma\lambda$ to the drift of $\bar{S}$, but does not change the volatility:

\[ d\bar{S}_t = \bar{S}_t [(\mu - r + \sigma\lambda) \, dt + \sigma \, d\tilde{W}_t] \]

We will see that arbitrage-free pricing must be done by computing expectations under a
risk-neutral measure, or equivalent martingale measure. This is a measure under which the
discounted asset price process $\bar{S}_t$ is a martingale, and thus has zero drift. Now for $\mathbb{Q}$ to be a
risk-neutral measure, we must have $(\mu - r + \sigma\lambda) = 0$, i.e.

\[ \sigma\lambda = r - \mu \quad \text{so that} \quad d\bar{S}_t = \bar{S}_t \sigma \, d\tilde{W}_t \]

This translates to $\lambda = -\frac{\mu - r}{\sigma}$, i.e. the Girsanov kernel is minus the market price of risk.

The fact that a Girsanov transformation does not affect the volatility is also important: 
It implies that we can use real-world observations to estimate risk-neutral world volatility. 

□ 
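
The arithmetic in these remarks can be spelled out in a few lines. This is a minimal sketch with made-up parameter values (`mu`, `r`, `sigma` below are illustrative, not taken from the notes): it computes the kernel $\lambda = -(\mu - r)/\sigma$ and confirms that the drift $\mu - r + \sigma\lambda$ of the discounted price vanishes.

```python
def girsanov_kernel(mu, r, sigma):
    """Kernel lambda solving mu - r + sigma * lambda = 0 (requires sigma != 0)."""
    if sigma == 0:
        raise ValueError("sigma must be non-zero")
    return (r - mu) / sigma


def discounted_drift_under_q(mu, r, sigma, lam):
    """Drift coefficient of the discounted price after a Girsanov shift by lam."""
    return mu - r + sigma * lam


mu, r, sigma = 0.08, 0.03, 0.2  # illustrative values only
lam = girsanov_kernel(mu, r, sigma)
print("Girsanov kernel (minus market price of risk):", lam)
print("drift of discounted price under Q           :", discounted_drift_under_q(mu, r, sigma, lam))
```

Note that `sigma` enters only through the kernel; the volatility itself is unchanged by the transformation.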

5.2 The Martingale Representation Theorem 
5.2.1 Motivation 

Arbitrage methods only yield a unique price for a contingent claim $X$ when it is possible to
replicate $X$, i.e. when there is a portfolio $\theta$ and an initial amount $X_0$ such that

\[ X = X_0 + G_T(\theta) \quad \text{or equivalently} \quad \bar{X} = X_0 + \bar{G}_T(\theta) \]

where $\bar{X}$ refers to the discounted payoff of $X$. For simplicity, assume that our model has
only one risky asset (stock) $S$. Now in discrete time, the gain on a self-financing previsible
portfolio $\theta$ involving stock and bank account is given by

\[ \bar{G}_T(\theta) = \sum_{k=1}^T \theta_k (\bar{S}_k - \bar{S}_{k-1}) = \sum_{k=1}^T \theta_k \Delta_k \bar{S} \]

where $\theta_k$ is the amount of stock held over the interval $[k - 1, k]$ and must be $\mathcal{F}_{k-1}$-measurable.
Interpolating this expression to continuous time, it is reasonable to model the gain of a
continuously traded portfolio by a stochastic integral:

\[ \bar{G}_t(\theta) = \int_0^t \theta_u \, d\bar{S}_u \]

where $\theta$ is predictable, and $\bar{S}$ a semimartingale (so that the stochastic integral is defined).

Assume now that there is a risk-neutral measure $\mathbb{Q}$, i.e. that the discounted asset price
process $\bar{S}_t$ is a $(\mathcal{F}_t, \mathbb{Q})$-martingale. If the stock price dynamics are given by a geometric
Brownian motion with volatility parameter $\sigma$, then the risk-neutral dynamics are of the form

\[ d\bar{S}_t = \sigma \bar{S}_t \, d\tilde{W}_t \]

where $\tilde{W}_t$ is a $(\mathcal{F}_t, \mathbb{Q})$-BM. To hedge an arbitrary contingent claim $X$, we need to find a
predictable process $\theta$ such that

\[ \bar{X}_t = X_0 + \int_0^t \theta_u \sigma \bar{S}_u \, d\tilde{W}_u \]

Now define

\[ Z_t = \mathbb{E}_{\mathbb{Q}}[\bar{X}_T \mid \mathcal{F}_t] \]

so that $Z_t$ is a $(\mathcal{F}_t, \mathbb{Q})$-martingale. (Ignore the distinction between martingales and local
martingales for the moment.) If we can write the martingale $Z$ as a stochastic integral w.r.t.
Brownian motion, i.e. if there exists a predictable process $(H_t)_t$ such that

\[ Z_t = Z_0 + \int_0^t H_s \, d\tilde{W}_s \]

then $X$ has a replicating portfolio, namely

\[ \theta_t = \frac{H_t}{\sigma \bar{S}_t} \]

Thus the problem of finding a replicating portfolio for $X$ reduces to finding a way to represent
the martingale $Z_t = \mathbb{E}_{\mathbb{Q}}[\bar{X}_T \mid \mathcal{F}_t]$ as a stochastic integral.
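
To make the discrete-time gain formula above concrete, here is a small sketch that evaluates $\sum_k \theta_k (S_k - S_{k-1})$ for a hypothetical price path and a hypothetical holding strategy; the numbers are invented purely for illustration. Previsibility is reflected in the indexing: the holding `theta[k]` for the period ending at time `k + 1` is fixed before the price `prices[k + 1]` is revealed.

```python
def discrete_gain(theta, prices):
    """Gain sum_k theta_k * (S_k - S_{k-1}) of a discrete-time trading strategy.

    prices = [S_0, S_1, ..., S_T]; theta = [theta_1, ..., theta_T], where
    theta_k is the amount of stock held over the period [k-1, k].
    """
    if len(theta) != len(prices) - 1:
        raise ValueError("need exactly one holding per trading period")
    return sum(th * (prices[k + 1] - prices[k]) for k, th in enumerate(theta))


prices = [100.0, 102.0, 101.0, 105.0]  # hypothetical (discounted) prices S_0..S_3
theta = [1.0, 2.0, 0.5]                # hypothetical holdings over the three periods
print("gain of the strategy:", discrete_gain(theta, prices))
```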

5.2.2 Statement of Main Results 

Theorem 5.2.1 (Martingale Representation Theorem)

Let $W$ be a $d$-dimensional Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and let $\mathbb{F} =
(\mathcal{F}_t)_{t \ge 0}$ be the canonical filtration, augmented to satisfy the usual conditions. Then every
$(\mathbb{F}, \mathbb{P})$-local martingale is representable as a stochastic integral, i.e. for any such $M$
there exists a ($d$-dimensional) predictable process $H$ such that

\[ M_t = C + \int_0^t H_s \, dW_s \]

In particular, every $(\mathbb{F}, \mathbb{P})$-martingale has a continuous modification.

This will be a straightforward consequence of the

Theorem 5.2.2 (Itô Representation Theorem)

For any $X \in L^2(\Omega, \mathcal{F}_\infty, \mathbb{P})$, there exists a unique predictable process $H$ with $\mathbb{E}[\int_0^\infty H_t^2 \, dt] < \infty$
such that

\[ X = \mathbb{E}[X] + \int_0^\infty H_s \, dW_s \]

The above results, and their proofs below, are taken from Continuous Martingales and Brownian
Motion, by Revuz and Yor.

5.2.3 Preliminary Technicalities 

We aim first to prove the Itô Representation Theorem. We need a few elementary results
from Hilbert space theory. Let $(E, \langle \cdot, \cdot \rangle)$ be a Hilbert space.

Definition 5.2.3 A subset $X \subseteq E$ is said to be total if the linear span of $X$ (i.e. the set of
all linear combinations of vectors in $X$) is a dense subspace of $E$.

□

Thus a subset $X$ of $E$ is total if every element of $E$ can be approximated arbitrarily closely
by a linear combination of elements of $X$.

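Proposition 5.2.4 A subset $X \subseteq E$ is total iff

\[ \langle e, x \rangle = 0 \text{ for all } x \in X \quad \Longrightarrow \quad e = 0 \]

Proof: First assume that $X \subseteq E$ is total, so that $E = \mathrm{cl}(\mathrm{span}(X))$, and let $e \in E$. If
$\langle e, x \rangle = 0$ for all $x \in X$, then also $\langle e, f \rangle = 0$ for all $f \in \mathrm{span}(X)$. Choose $x_n \in \mathrm{span}(X)$ such
that $x_n \to e$. Then $\langle e, x_n \rangle \to \langle e, e \rangle$, so $\langle e, e \rangle = \|e\|^2 = 0$, hence $e = 0$.

Conversely, assume that $X \subseteq E$ satisfies the stated condition. Let $F = \mathrm{cl}(\mathrm{span}(X))$.
We must prove that $E = F$. So let $e \in E$ be arbitrary, and write $e = e_\parallel + e_\perp$, where
$e_\parallel \in F$, $e_\perp \perp F$. Then $\langle e_\perp, x \rangle = 0$ for all $x \in X$, so that $e_\perp = 0$ and $e = e_\parallel \in F$. Hence $E \subseteq F$, as
required.

□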

The Itô Representation Theorem is a statement about the Hilbert space $L^2(\Omega, \mathcal{F}_\infty, \mathbb{P})$.
We begin by finding a convenient total subset of this space. For simplicity assume that $W$ is
a 1-dimensional Brownian motion. Let $\mathcal{S}$ be the set of deterministic step functions on $\mathbb{R}_+$,
i.e. those functions $f$ which can be written

\[ f = \sum_{i=1}^n \lambda_i 1_{(t_{i-1}, t_i]} \]






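Lemma 5.2.5 The set

\[ \mathcal{E}(\mathcal{S}) := \Big\{ \mathcal{E}(f)_\infty = e^{\int_0^\infty f(t) \, dW_t - \frac{1}{2} \int_0^\infty f(t)^2 \, dt} : f \in \mathcal{S} \Big\} \]

of Doléans exponentials is total in $L^2(\Omega, \mathcal{F}_\infty, \mathbb{P})$.

Proof: Let $Y \in L^2(\Omega, \mathcal{F}_\infty, \mathbb{P})$ be such that $\mathbb{E}[\mathcal{E}(f)_\infty \cdot Y] = 0$ for all $f \in \mathcal{S}$. We want to show
that $Y = 0$ a.s. Fix $0 = t_0 < t_1 < \dots < t_n$, and define $\varphi : \mathbb{C}^n \to \mathbb{C}$ by

\[ \varphi(z_1, \dots, z_n) = \mathbb{E}\Big[ e^{\sum_{j=1}^n z_j (W_{t_j} - W_{t_{j-1}})} \cdot Y \Big] \]

Now if $\lambda_1, \dots, \lambda_n \in \mathbb{R}$, and $f = \sum_{j=1}^n \lambda_j 1_{(t_{j-1}, t_j]}$, then

\[ \varphi(\lambda_1, \dots, \lambda_n) = e^{\frac{1}{2} \int_0^\infty f(t)^2 \, dt} \, \mathbb{E}[\mathcal{E}(f)_\infty \cdot Y] = 0 \]

It is now not too hard to believe that $\varphi(i\lambda_1, \dots, i\lambda_n) = 0$ for all $\lambda_1, \dots, \lambda_n$. (To be precise:
Noting that $W$ is a Gaussian process, it is not hard to see that $\varphi$ is analytic. Since $\varphi = 0$ on
$\mathbb{R}^n$, we must have $\varphi = 0$ on $\mathbb{C}^n$, by analytic continuation.)

The expression

\[ \varphi(i\lambda_1, \dots, i\lambda_n) = \mathbb{E}\Big[ e^{i \sum_{j=1}^n \lambda_j (W_{t_j} - W_{t_{j-1}})} \cdot Y \Big] \]

looks a little like a characteristic function, i.e. a Fourier transform. To make this explicit, define a signed measure $\mu$
on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ by $\mu(A) = \mathbb{E}[1_A(W_{t_1} - W_{t_0}, \dots, W_{t_n} - W_{t_{n-1}}) \cdot Y]$, i.e. $\mu = \nu \circ X^{-1}$, where
$d\nu = Y \, d\mathbb{P}$ and $X = (W_{t_1} - W_{t_0}, \dots, W_{t_n} - W_{t_{n-1}})$. Then

\[ \varphi(i\lambda_1, \dots, i\lambda_n) = \int e^{i \langle \lambda, x \rangle} \, \mu(dx) \qquad \lambda = (\lambda_1, \dots, \lambda_n) \]

is the Fourier transform of the measure $\mu$. Since it is identically zero, by the Fourier inversion
theorem (analogous to Lévy inversion), the measure $\mu$ is identically zero. Now clearly $\sigma(X) =
\sigma(W_{t_1}, \dots, W_{t_n})$, so the measure $\nu$ is zero on $\sigma(W_{t_1}, \dots, W_{t_n})$, for all $0 = t_0 < t_1 < \dots < t_n$.

Now let $\{q_n : n \in \mathbb{N}\}$ enumerate a dense subset of $\mathbb{R}_+$, and let $\mathcal{H}_n = \sigma(W_{q_1}, \dots, W_{q_n})$.
We've just proved that $\nu$ is zero on each $\mathcal{H}_n$. Fix an arbitrary bounded $\mathcal{F}_\infty$-measurable
variable $Z$, and let $Z_n = \mathbb{E}[Z \mid \mathcal{H}_n]$. By the martingale convergence theorem, $Z_n \to Z$ a.s.
and in $L^1$. Hence, since $Z$ is bounded, $Z_n Y \to ZY$ a.s. and in $L^1$. But since $\nu$ is zero on
$\mathcal{H}_n$, we have $\mathbb{E}[Z_n Y] = \int Z_n \, d\nu = 0$ for all $n$. It follows that $\mathbb{E}[ZY] = 0$ for all bounded
$\mathcal{F}_\infty$-measurable R.V.'s $Z$. Take $Z = Y \wedge N$ for sufficiently big $N$ to conclude $Y = 0$ a.s.

□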

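Further recall the following: In constructing the stochastic integral, we made use of the
following isometry for square-integrable martingales $M$:

\[ \|H\|_{L^2(M)} = \|H \bullet M\|_{\mathcal{M}^2} \quad \text{i.e.} \quad \mathbb{E}\Big[ \int_0^\infty H_t^2 \, d[M]_t \Big] = \mathbb{E}\Big[ \Big( \int_0^\infty H_t \, dM_t \Big)^2 \Big] \]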



5.2.4 Proof of Main Results 

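Proof of Itô Representation Theorem: Uniqueness is a simple consequence of the Itô
isometry: Indeed, if $(H \bullet W)_\infty = (K \bullet W)_\infty$, then $\|H - K\|_{L^2(W)} = \|(H - K) \bullet W\|_{\mathcal{M}^2} = 0$,
so that $H = K$ in $L^2(W)$.

Denote by $\mathcal{H}$ the set of all $X \in L^2(\Omega, \mathcal{F}_\infty, \mathbb{P})$ which can be represented in the form

\[ X = \mathbb{E}[X] + \int_0^\infty H_t \, dW_t \]

for some predictable $H$ with $\mathbb{E}[\int_0^\infty H_t^2 \, dt] < \infty$. By linearity of the integral, $\mathcal{H}$ is a subspace
of $L^2$. Moreover, $\mathcal{E}(\mathcal{S}) \subseteq \mathcal{H}$: Indeed, if $f = \sum_{j=1}^n \lambda_j 1_{(t_{j-1}, t_j]} \in \mathcal{S}$, then $d\mathcal{E}(f)_t =
\mathcal{E}(f)_t f(t) \, dW_t$ (the well-known SDE satisfied by Doléans exponentials), so that $\mathcal{E}(f)_\infty =
1 + \int_0^\infty \mathcal{E}(f)_t f(t) \, dW_t$.

Since $\mathcal{E}(\mathcal{S})$ is total and $\mathcal{E}(\mathcal{S}) \subseteq \mathcal{H}$, it suffices to show that $\mathcal{H}$ is closed. So let $(X_n)_n$, with
$X_n = \mathbb{E}[X_n] + \int_0^\infty H^n_t \, dW_t$, be a sequence in $\mathcal{H}$ which is Cauchy in $L^2$, i.e.

\[ \mathbb{E}[(X_n - X_m)^2] = \mathbb{E}[X_n - X_m]^2 + \|(H^n - H^m) \bullet W\|^2_{\mathcal{M}^2} \to 0 \quad \text{as } n, m \to \infty \]

We can conclude two things: (i) $\mathbb{E}[X_n - X_m]^2 \to 0$, from which we deduce that $(\mathbb{E}[X_n])_n$
is Cauchy, hence convergent, and (ii) $\|(H^n - H^m) \bullet W\|_{\mathcal{M}^2} \to 0$, from which we deduce
that $(H^n)_n$ is a Cauchy sequence in $L^2(W)$, by the isometry, and hence convergent to some
predictable $H$. It is now easy to see that the sequence $(X_n)_n$ converges to

\[ \lim_n \mathbb{E}[X_n] + \int_0^\infty H_t \, dW_t \]

which belongs to $\mathcal{H}$. Since every Cauchy sequence in $\mathcal{H}$ converges in $\mathcal{H}$, it is closed.

□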



Proof of Martingale Representation Theorem: First suppose that $M$ is a square-integrable
martingale; then there is a unique predictable $H$ such that

\[ M_\infty = \mathbb{E}[M_\infty] + \int_0^\infty H_t \, dW_t \]

Then $M_t = \mathbb{E}[M_\infty \mid \mathcal{F}_t] = M_0 + \int_0^t H_s \, dW_s$. It follows that square-integrable martingales have
continuous versions.

Next suppose that $M$ is a uniformly integrable martingale (so that it converges to $M_\infty$
a.s. and in $L^1$). Since $L^2(\Omega, \mathcal{F}_\infty, \mathbb{P})$ is dense in $L^1(\Omega, \mathcal{F}_\infty, \mathbb{P})$, we can choose random variables
$M^n_\infty \in L^2$ such that $\mathbb{E}[|M^n_\infty - M_\infty|] \to 0$. Define martingales $M^n$ by $M^n_t = \mathbb{E}[M^n_\infty \mid \mathcal{F}_t]$. Then
the $M^n$ are square-integrable, and thus continuous. By Doob's maximal inequality, we see
that

\[ \mathbb{P}\Big[ \sup_t |M_t - M^n_t| > \lambda \Big] \le \lambda^{-1} \, \mathbb{E}[|M^n_\infty - M_\infty|] \to 0 \]

By the Borel-Cantelli Lemma we can pick a subsequence $(M^{n_k})_k$ which converges uniformly to
$M$ a.s. Hence $M$ has a continuous version.

Finally, let $M$ be a local martingale. Then there exists a sequence of stopping times $\tau_n \uparrow \infty$
such that each stopped martingale $M^{\tau_n}$ is uniformly integrable, hence continuous. Modifying
the $\tau_n$, if necessary, we may assume that each $M^{\tau_n}$ is bounded, hence square-integrable. By
the first part of the proof, we see that

\[ M^{\tau_n}_s = M_0 + \int_0^{\tau_n \wedge s} H^n_u \, dW_u \]

for some unique predictable $H^n$. Uniqueness ensures furthermore that if $m < n$, then $H^m, H^n$
coincide on $(0, \tau_m]$, and we may denote this common value by $H$. The result follows.

□



Chapter 6 

SDEs and PDEs 



6.1 SDEs: Existence and Uniqueness of Solutions 

An ODE is some functional relationship

\[ f(t, x(t), x'(t), x''(t), \dots) = 0 \]

involving, say, time $t$ and an unknown function $x$ and its derivatives.

Example 6.1.1 Consider the following population growth model

\[ \frac{dN}{dt} = a(t) N(t) \qquad N(0) = N_0 \]

where $N(t)$ is the size of the population at time $t$. Of course, this is easy to solve:

\[ N(t) = N_0 \, e^{\int_0^t a(s) \, ds} \]

Here $a(t)$, the relative growth rate, is deterministic. However, we can easily imagine there to
be "noise" in the system, i.e.

\[ a(t) = r(t) + \text{"random noise"} \]

If we interpret ("noise") $\cdot \, dt$ to be some random perturbation $\sigma(t) \, dW_t$, where $W_t$ is a standard
Brownian motion, then we obtain

\[ dN(t) = r(t) N(t) \, dt + \sigma(t) N(t) \, dW_t \]

We choose to interpret this in the Itô sense:

\[ N(t) = N_0 + \int_0^t r(s) N(s) \, ds + \int_0^t \sigma(s) N(s) \, dW_s \]

If $r, \sigma$ are constants, this becomes our old friend geometric Brownian motion

\[ \frac{dN_t}{N_t} = r \, dt + \sigma \, dW_t \]

whose solution we know:

\[ N(t) = N(0) \, e^{(r - \frac{1}{2} \sigma^2) t + \sigma W_t} \]

as you can easily verify by Itô's formula.




□ 
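
The closed-form solution of Example 6.1.1 can be compared with a direct discretisation of the SDE. The sketch below is an illustration under stated assumptions: constant `r` and `sigma` with made-up values, a uniform time grid, and the Euler-Maruyama scheme $N_{k+1} = N_k + r N_k \, \Delta t + \sigma N_k \, \Delta_k W$ (a standard discretisation, not something introduced in these notes). Both values are computed from the same simulated Brownian increments.

```python
import math
import random


def gbm_euler_vs_exact(n0=1.0, r=0.05, sigma=0.2, t=1.0, n_steps=10000, seed=0):
    """Euler-Maruyama value and exact solution of dN = r N dt + sigma N dW at time t."""
    rng = random.Random(seed)
    dt = t / n_steps
    n_euler = n0
    w = 0.0  # running value of the driving Brownian motion
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        n_euler += r * n_euler * dt + sigma * n_euler * dw  # one Euler-Maruyama step
        w += dw
    # Closed form N(0) * exp((r - sigma^2 / 2) t + sigma W_t) on the same path.
    n_exact = n0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * w)
    return n_euler, n_exact


n_euler, n_exact = gbm_euler_vs_exact()
print("Euler-Maruyama :", n_euler)
print("closed form    :", n_exact)
```

On a fine grid the two values should agree closely; the term $-\frac{1}{2} \sigma^2 t$ in the exponent is exactly what the Itô correction contributes.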



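Definition 6.1.2 An Itô diffusion is an SDE of the form

\[ dX_t = b(t, X_t) \, dt + \sigma(t, X_t) \, dW_t \]

where

\[ b : \mathbb{R}_+ \times \mathbb{R}^n \to \mathbb{R}^n \qquad \sigma : \mathbb{R}_+ \times \mathbb{R}^n \to \mathbb{R}^{n \times m} \]

Here, $X_t$ is an $n$-dimensional stochastic process, and $W_t$ is an $m$-dimensional Brownian
motion. Thus

\[
\begin{pmatrix} dX^1_t \\ \vdots \\ dX^n_t \end{pmatrix}
=
\begin{pmatrix} b^1(t, X^1_t, \dots, X^n_t) \\ \vdots \\ b^n(t, X^1_t, \dots, X^n_t) \end{pmatrix} dt
+
\begin{pmatrix}
\sigma_{11}(t, X^1_t, \dots, X^n_t) & \dots & \sigma_{1m}(t, X^1_t, \dots, X^n_t) \\
\vdots & & \vdots \\
\sigma_{n1}(t, X^1_t, \dots, X^n_t) & \dots & \sigma_{nm}(t, X^1_t, \dots, X^n_t)
\end{pmatrix}
\begin{pmatrix} dW^1_t \\ \vdots \\ dW^m_t \end{pmatrix}
\]

which is to be interpreted as a system of stochastic integral equations

\[ X^i_t = X^i_0 + \int_0^t b^i(s, X^1_s, \dots, X^n_s) \, ds + \sum_{k=1}^m \int_0^t \sigma_{ik}(s, X^1_s, \dots, X^n_s) \, dW^k_s \]

We denote the SDE above by SDE$(\sigma, b)$.

□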



As with ODEs, there is a theorem which guarantees that, under certain conditions, solutions
exist and are unique. Before we state this result, it's important to note that
there are two common notions of solution to an SDE. Given SDE$(\sigma, b)$, a solution is a triple
$(X_t, B_t, \mathcal{F}_t)$ such that

(i) $B_t$ is an $\mathcal{F}_t$-Brownian motion;

(ii) $X_t$ satisfies

\[ X_t = X_0 + \int_0^t b(s, X_s) \, ds + \int_0^t \sigma(s, X_s) \, dB_s \]

If $X_t$ is adapted to the filtration generated by $B_t$, then $X_t$ is called a strong solution. In
essence, given a Brownian motion $B_t$, we can then construct a solution to SDE$(\sigma, b)$ from
this Brownian motion $B_t$. However, it may be impossible to solve SDE$(\sigma, b)$ using a given
Brownian motion, but nevertheless possible to solve it by constructing $X_t$ and a different
Brownian motion, i.e. it is necessary to construct $X_t$ and $B_t$ simultaneously. In that case,
we call the solution a weak solution.

We are mainly interested in strong solutions. We say that a solution is (pathwise) unique
if given any two solutions $(X_t, B_t, \mathcal{F}_t)$ and $(X'_t, B_t, \mathcal{F}'_t)$ to SDE$(\sigma, b)$ with $X_0 = X'_0 = x$, driven
by the same Brownian motion $B_t$, we have, with probability one,

\[ X_t = X'_t \quad \text{for all } t \ge 0 \]



Without proof, we state: 






Theorem 6.1.3 (Existence and Uniqueness Theorem)
Let $T > 0$. Given the following:

• An $m$-dimensional Brownian motion $W_t$;

• An SDE SDE$(\sigma, b)$;

• An $n$-dimensional random variable $Z$ independent of $(W_t)_{t \le T}$ with $\mathbb{E}\|Z\|^2 < \infty$ (in particular,
$Z$ may be constant).

Suppose that there is a constant $C$ such that

(i) The following Lipschitz condition holds: For all $x, y \in \mathbb{R}^n$ and all $0 \le t \le T$ we
have

\[ \|b(t, x) - b(t, y)\| \le C \|x - y\| \]
\[ \|\sigma(t, x) - \sigma(t, y)\| \le C \|x - y\| \]

(ii) The following linear growth condition holds: For all $x \in \mathbb{R}^n$, $0 \le t \le T$

\[ \|b(t, x)\| + \|\sigma(t, x)\| \le C (1 + \|x\|) \]

Then there exists a unique strong solution $X_t$ to the SDE

\[ dX_t = b(t, X_t) \, dt + \sigma(t, X_t) \, dW_t \qquad X_0 = Z \]

Moreover

1. $X_t$ is adapted to the natural filtration (augmented) generated by $Z$ and $W_t$.

2. $X_t$ has continuous sample paths.

3. $X_t$ is a (strong) Markov process.

4. $\mathbb{E} \int_0^T \|X_t\|^2 \, dt < \infty$.

□ 

The above is not the best possible theorem. In the 1-dimensional case, particularly, it 
can be strengthened considerably. 

6.2 The Linear SDE 

In this section we "solve" the one-dimensional linear SDE 

\[ dX_t = [b_1(t) X_t + b_2(t)] \, dt + [\sigma_1(t) X_t + \sigma_2(t)] \, dW_t \]

where $b_i, \sigma_i$ are deterministic continuous functions (and thus bounded on compact intervals).
The Existence and Uniqueness Theorem guarantees the existence of a unique strong solution. 
We will solve this SDE in three steps. 






6.2.1 The Linear SDE with additive noise only

Consider

\[ dX_t = [b_1(t)X_t + b_2(t)]\,dt + \sigma_2(t)\,dW_t \]

(i.e. $\sigma_1 = 0$). An ordinary linear DE $x'(t) = b(t)x(t) + u(t)$ is solved using an integrating
factor $y(t) = e^{-\int_0^t b(s)\,ds}$, which reduces the problem to $(x(t)y(t))' = u(t)y(t)$, which can be solved
by integrating both sides. We try the same trick here. So put

\[ y(t) = e^{-\int_0^t b_1(s)\,ds}, \qquad Y_t = y(t)X_t \]

Then by Ito's formula

\begin{align*}
dY_t &= -b_1(t)y(t)X_t\,dt + y(t)\,dX_t \\
     &= y(t)\,[-b_1(t)X_t + b_1(t)X_t + b_2(t)]\,dt + y(t)\sigma_2(t)\,dW_t \\
     &= y(t)b_2(t)\,dt + y(t)\sigma_2(t)\,dW_t
\end{align*}

Note that $Y_0 = X_0$, because $y(0) = 1$. Integrating the above equation, and dividing by $y(t)$, we obtain



The solution to the SDE

\[ dX_t = [b_1(t)X_t + b_2(t)]\,dt + \sigma_2(t)\,dW_t \]

is given by

\[ X_t = y(t)^{-1}\left[ X_0 + \int_0^t b_2(s)y(s)\,ds + \int_0^t \sigma_2(s)y(s)\,dW_s \right] \]

where $y(t) = e^{-\int_0^t b_1(s)\,ds}$.



Example 6.2.1 The Langevin Equation

\[ dX_t = cX_t\,dt + \sigma\,dW_t \]

Here $b_1(t) = c$ = constant, $b_2(t) = 0$, $\sigma_1(t) = 0$, and $\sigma_2(t) = \sigma$ = constant.
Thus the integrating factor is $y(t) = e^{-\int_0^t c\,ds} = e^{-ct}$, and hence

\[ X_t = e^{ct}X_0 + \sigma\int_0^t e^{c(t-s)}\,dW_s \]

Now recall that if $h(t)$ is deterministic, then $Z = \int_0^t h(s)\,dW_s$ is Gaussian, with
$Z \sim N(0, \int_0^t h(s)^2\,ds)$ (using the Ito isometry). It follows that we know the distribution of $X_t$: for constant $X_0$, $X_t$
is normally distributed with mean $e^{ct}\mathbb{E}[X_0]$ and variance $\sigma^2\,\dfrac{e^{2ct} - 1}{2c}$.



□ 
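The variance in the Langevin example can be checked by a short worked computation. The following is a sketch under the assumption that $X_0$ is constant and $c \neq 0$; it uses only the Ito isometry quoted above.

```latex
\begin{align*}
\operatorname{Var}(X_t)
  &= \sigma^2\,\mathbb{E}\left[\left(\int_0^t e^{c(t-s)}\,dW_s\right)^{2}\right]
   = \sigma^2 \int_0^t e^{2c(t-s)}\,ds \\
  &= \sigma^2\,\frac{e^{2ct}-1}{2c}.
\end{align*}
```

For $c < 0$ this expression stays bounded and tends to $\sigma^2/(2|c|)$ as $t \to \infty$, which is the mean-reverting case used in the Vasicek model.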



Example 6.2.2 The Vasicek Model:
This is a short rate model:

\[ dr_t = c[\mu - r_t]\,dt + \sigma\,dW_t \]






where $c, \mu, \sigma$ are constant. Thus $b_1(t) = -c$, $b_2(t) = c\mu$, $\sigma_1(t) = 0$ and $\sigma_2(t) = \sigma$.
The integrating factor is $y(t) = \exp(\int_0^t c\,ds) = e^{ct}$, and so

\begin{align*}
r_t &= e^{-ct}\left[ r_0 + \int_0^t c\mu e^{cs}\,ds + \int_0^t \sigma e^{cs}\,dW_s \right] \\
    &= r_0 e^{-ct} + \mu[1 - e^{-ct}] + \sigma\int_0^t e^{-c(t-s)}\,dW_s
\end{align*}

Thus $r_t$ is normally distributed with mean $r_0 e^{-ct} + \mu(1 - e^{-ct})$ and variance $\sigma^2\,\dfrac{1 - e^{-2ct}}{2c}$. As
$t \to \infty$, the short rate "forgets" its initial value $r_0$ and approaches a normally distributed
random variable with mean $\mu$ and variance $\dfrac{\sigma^2}{2c}$.



□ 
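The mean and variance stated for the Vasicek model can be checked numerically. The sketch below is illustrative only: it simulates the SDE with a plain Euler-Maruyama scheme (a discretisation chosen here, not a method from the text) and compares sample moments with the closed-form expressions. All function names and parameter values are hypothetical.

```python
import math
import random


def vasicek_mean(r0, c, mu, t):
    """Closed-form mean of r_t: r0*exp(-c t) + mu*(1 - exp(-c t))."""
    return r0 * math.exp(-c * t) + mu * (1.0 - math.exp(-c * t))


def vasicek_var(c, sigma, t):
    """Closed-form variance of r_t: sigma^2 (1 - exp(-2 c t)) / (2 c)."""
    return sigma ** 2 * (1.0 - math.exp(-2.0 * c * t)) / (2.0 * c)


def simulate_vasicek(r0, c, mu, sigma, t, n_steps, n_paths, seed=0):
    """Euler-Maruyama samples of r_t (an approximation, assumed for illustration)."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    terminal = []
    for _path in range(n_paths):
        r = r0
        for _step in range(n_steps):
            # dr = c (mu - r) dt + sigma dW, with dW ~ N(0, dt)
            r += c * (mu - r) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        terminal.append(r)
    return terminal


def sample_mean_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v
```

With a few thousand paths the sample mean and variance should agree with `vasicek_mean` and `vasicek_var` up to Monte Carlo and discretisation error; for large `t` the variance approaches `sigma**2 / (2*c)`.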



6.2.2 The Homogeneous Linear SDE with Multiplicative Noise 

Next, we consider 

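\[ dX_t = b_1(t)X_t\,dt + \sigma_1(t)X_t\,dW_t \]

i.e. $b_2(t) = 0 = \sigma_2(t)$. If $b_1, \sigma_1$ are constants, the above SDE is simply geometric Brownian
motion. So try the following: Define $Y_t = \ln X_t$. Then

\[ dY_t = \frac{1}{X_t}\,dX_t - \frac{1}{2X_t^2}\,d[X]_t \]

and thus

\[ dY_t = \left[ b_1(t) - \tfrac{1}{2}\sigma_1(t)^2 \right] dt + \sigma_1(t)\,dW_t \]

so that

\[ Y_t = Y_0 + \int_0^t \left( b_1(s) - \tfrac{1}{2}\sigma_1(s)^2 \right) ds + \int_0^t \sigma_1(s)\,dW_s \]

The solution to the SDE

\[ dX_t = b_1(t)X_t\,dt + \sigma_1(t)X_t\,dW_t \]

is

\[ X_t = X_0\,\mathcal{E}\left( \int b_1\,dt + \int \sigma_1\,dW_t \right)_t
       = X_0 \exp\left( \int_0^t \left( b_1(s) - \tfrac{1}{2}\sigma_1(s)^2 \right) ds + \int_0^t \sigma_1(s)\,dW_s \right) \]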

In the case of additive noise only, we saw that the solutions are Gaussian processes. In the 
case of homogeneous SDE's with multiplicative noise, solutions are lognormally distributed. 

6.2.3 The General Linear SDE 

Given

\[ dX_t = [b_1(t)X_t + b_2(t)]\,dt + [\sigma_1(t)X_t + \sigma_2(t)]\,dW_t \]

let $Y_t$ be the solution to the corresponding homogeneous SDE

\[ dY_t = b_1(t)Y_t\,dt + \sigma_1(t)Y_t\,dW_t, \qquad Y_0 = 1 \]






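An application of Ito's formula shows that

\[ d\left( \frac{X_t}{Y_t} \right) = \frac{1}{Y_t}\,dX_t + X_t\,d\left( \frac{1}{Y_t} \right) + d\left[ X, \frac{1}{Y} \right]_t
   = \frac{1}{Y_t}\Big[ (b_2(t) - \sigma_1(t)\sigma_2(t))\,dt + \sigma_2(t)\,dW_t \Big] \]

Integrate:

\[ \frac{X_t}{Y_t} = \frac{X_0}{Y_0} + \int_0^t \frac{b_2(s) - \sigma_1(s)\sigma_2(s)}{Y_s}\,ds + \int_0^t \frac{\sigma_2(s)}{Y_s}\,dW_s \]

and thus, since $Y_0 = 1$:

The solution to the SDE

\[ dX_t = (b_1(t)X_t + b_2(t))\,dt + (\sigma_1(t)X_t + \sigma_2(t))\,dW_t \]

is given by

\[ X_t = Y_t\left[ X_0 + \int_0^t \frac{b_2(s) - \sigma_1(s)\sigma_2(s)}{Y_s}\,ds + \int_0^t \frac{\sigma_2(s)}{Y_s}\,dW_s \right] \]

where $Y_t$ solves the corresponding homogeneous SDE with $Y_0 = 1$:

\[ Y_t = \exp\left( \int_0^t \left( b_1(s) - \tfrac{1}{2}\sigma_1(s)^2 \right) ds + \int_0^t \sigma_1(s)\,dW_s \right) \]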
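As a sanity check of the variation-of-constants formula for the general linear SDE, one can compare it with a direct Euler-Maruyama discretisation driven by the same Brownian increments. The sketch below assumes constant coefficients (so that $Y_t = \exp((b_1 - \tfrac12\sigma_1^2)t + \sigma_1 W_t)$) and approximates the two integrals by left-endpoint sums; the function name and parameters are hypothetical.

```python
import math
import random


def linear_sde_paths(x0, b1, b2, s1, s2, T, n_steps, seed=0):
    """Return (Euler-Maruyama value, closed-form value) of X_T on one path.

    Assumes constant coefficients b1, b2, s1, s2 (an illustrative special case).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x_euler = x0
    w = 0.0          # Brownian motion W at the left endpoint of the step
    bracket = 0.0    # running value of the ds- and dW-integrals in the formula
    for k in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Homogeneous solution Y at the left endpoint t_k (Ito sums use left endpoints)
        y = math.exp((b1 - 0.5 * s1 ** 2) * (k * dt) + s1 * w)
        bracket += (b2 - s1 * s2) / y * dt + s2 / y * dw
        # One Euler-Maruyama step of the SDE itself
        x_euler += (b1 * x_euler + b2) * dt + (s1 * x_euler + s2) * dw
        w += dw
    y_T = math.exp((b1 - 0.5 * s1 ** 2) * T + s1 * w)
    x_formula = y_T * (x0 + bracket)
    return x_euler, x_formula
```

On a fine grid the two returned values should agree closely on every path; the gap shrinks as `n_steps` grows, since both are discretisations of the same object.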



6.3 Solving PDEs Probabilistically 

6.3.1 Solving the Heat Equation Probabilistically 

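We show here how it is possible to solve certain PDE's by running a Brownian motion. For
simplicity's sake consider the $d$-dimensional heat equation, with initial conditions:

\[ u_t(t,x) = \tfrac{1}{2}\Delta u(t,x), \qquad u(0,x) = f(x) \tag{$*$} \]

where $u : \mathbb{R}^+ \times \mathbb{R}^d \to \mathbb{R}$ is $C^{1,2}$ (and $x = (x^1, \dots, x^d)$). For example, in two dimensions,

\[ \frac{\partial u}{\partial t} = \frac{1}{2}\left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right), \qquad u(0,x,y) = f(x,y) \]

We will also write $\nabla u = \left( \frac{\partial u}{\partial x^1}, \dots, \frac{\partial u}{\partial x^d} \right)$.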
To begin with, we note that 

Proposition 6.3.1 If $u$ satisfies $(*)$, then $M_s = u(t - s, B_s)$ is a local martingale on $[0,t)$.
Here $B_t$ is a $d$-dimensional Brownian motion (not necessarily null at 0).






Proof: Let $X^0_s = t - s$ and $X^i_s = B^i_s$ for $1 \le i \le d$. Further define $Y_s = u(X^0_s, \dots, X^d_s) =
u(t - s, B_s)$, and apply Ito's formula to obtain

\[ dY_s = -u_t(t - s, B_s)\,ds + \nabla u(t - s, B_s) \cdot dB_s + \tfrac{1}{2}\Delta u(t - s, B_s)\,ds \]

where we use the fact that $d[B^i]_s = ds$, $d[B^i, B^j]_s = 0$ for $i \ne j$ both $\ge 1$. Since $u$ satisfies $(*)$, the $ds$-terms cancel. Hence

\[ M_s = u(t - s, B_s) = u(t, B_0) + \int_0^s \nabla u(t - r, B_r) \cdot dB_r \]

is a local martingale.

□ 

Now let $B_t$ be a Brownian motion starting at $x \in \mathbb{R}^d$. Then

\[ u(t,x) = M_0 = \mathbb{E}^x[M_t] = \mathbb{E}^x[u(0, B_t)] = \mathbb{E}^x[f(B_t)] \]

where $\mathbb{E}^x$ denotes the expectation under a measure where $B_t$ starts at $x$. Thus we can solve
the heat equation as follows:

(i) Start a Brownian motion from $x$, and let it run for a time $t$.

(ii) Plug the value $B_t$ into the initial condition $f$ to obtain a random variable $f(B_t)$.

(iii) $u(t,x)$ is the expectation of $f(B_t)$: $u(t,x) = \mathbb{E}^x[f(B_t)]$

Thus it is possible to solve the heat equation by running a Brownian motion. With some 
modifications, we can solve many parabolic and elliptic problems by running some stochastic 
process and taking expectations. We now begin the process of making this precise. 
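The three-step recipe above can be sketched as a small Monte Carlo routine. This is a minimal illustration, not a production solver: it samples $B_t = x + \sqrt{t}\,Z$ directly (which is exact for Brownian motion) and averages $f(B_t)$. The helper names are hypothetical. As a test case, $f(x) = \|x\|^2$ gives $u(t,x) = \|x\|^2 + d\,t$, which indeed satisfies $u_t = d = \tfrac12 \Delta u$.

```python
import math
import random


def heat_solution_mc(f, t, x, n_paths=100_000, seed=0):
    """Estimate u(t, x) = E^x[f(B_t)] for the heat equation u_t = (1/2) Laplacian(u).

    x is a sequence of d coordinates; f maps a list of d floats to a float.
    """
    rng = random.Random(seed)
    sd = math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        # (i) run a Brownian motion from x for time t: B_t = x + sqrt(t) * Z
        b_t = [xi + sd * rng.gauss(0.0, 1.0) for xi in x]
        # (ii) plug B_t into the initial condition
        total += f(b_t)
    # (iii) average to approximate the expectation
    return total / n_paths


def squared_norm(p):
    """Test initial condition f(x) = |x|^2, for which u(t, x) = |x|^2 + d*t."""
    return sum(pi * pi for pi in p)
```

For example, `heat_solution_mc(squared_norm, 0.5, [1.0, 0.0])` should be close to $1 + 2 \cdot 0.5 = 2$, up to Monte Carlo error of order $1/\sqrt{n}$.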

6.3.2 The Black-Scholes PDE: A Heuristic Approach 

Using Ito's formula, it is not hard to derive a partial differential equation for European-style 
derivatives. 

Consider again a market with a share $S_t$ whose price process is given by a geometric Brownian
motion, i.e. satisfies the SDE

\[ dS_t = \mu S_t\,dt + \sigma S_t\,dB_t \]

Let the risk-free interest rate be $r$, and let $A_t$ be the riskless bank account, with dynamics

\[ dA_t = rA_t\,dt \]

Let $V(t, S_t)$ be a European-style derivative whose value depends on both the share price and
time. Consider a portfolio $\Pi$ which contains 1 derivative, and $n$ shares, i.e. its value is

\[ \Pi_t = V_t + nS_t \]

A small amount of time $dt$ later, the share price has changed. The value of the portfolio
changes by

\[ d\Pi_t = dV_t + n\,dS_t \]






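By Ito's Formula,

\begin{align*}
dV_t &= \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\,dS_t + \frac{1}{2}\frac{\partial^2 V}{\partial S^2}\,d[S]_t \\
     &= \left( \frac{\partial V}{\partial t} + \mu S\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S\frac{\partial V}{\partial S}\,dB_t
\end{align*}

Hence

\[ d\Pi_t = \left( \frac{\partial V}{\partial t} + \mu S\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S\frac{\partial V}{\partial S}\,dB_t + n\left( \mu S\,dt + \sigma S\,dB_t \right) \]

Thus

\[ d\Pi_t = \left( \frac{\partial V}{\partial t} + \mu S\left( \frac{\partial V}{\partial S} + n \right) + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S\left( \frac{\partial V}{\partial S} + n \right) dB_t \]

Now if we take $n = -\frac{\partial V}{\partial S}$ (i.e. the portfolio is short $\frac{\partial V}{\partial S}$ shares), then the portfolio is
unaffected by the random changes in stock prices:

\[ d\Pi_t = \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} \right) dt \]

Thus, for a brief moment, the portfolio is risk-free. By a no-arbitrage argument, it must earn
the same return as the risk-free bank account,¹ i.e.

\[ d\Pi_t = r\Pi_t\,dt = r\left( V - \frac{\partial V}{\partial S}S \right) dt \]

Equating the last two expressions for $d\Pi_t$, we get

\[ \frac{\partial V}{\partial t} + rS\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} - rV = 0 \]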



This is the famous Black-Scholes PDE. It is a second-order parabolic PDE, i.e. essentially
a heat equation. Most of the PDE's encountered in finance are of a similar type.

Note that if a portfolio contains $\frac{\partial V}{\partial S}$ shares, then the change in the portfolio value is the
same as the change in the value of the derivative. The quantity $\frac{\partial V}{\partial S}$ is called the delta of the
derivative. One can thus synthetically replicate any European style derivative with underlying
share $S$ by holding, at any time, delta-many shares. This procedure is called delta hedging.

Consider a European call option $C$ on a share $S$ with strike $K$ and maturity $T$. The
volatility of the underlying share $S$ is $\sigma$ and the risk-free rate is $r$. To find the value of the
call option, we must solve the following boundary value problem:

\[ \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 C}{\partial S^2} - rC = 0, \qquad C(T) = \max\{S_T - K, 0\} \]



¹This is the crux of the argument!
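One way to get a feel for the boundary value problem is to check numerically that a candidate solution satisfies it. The sketch below uses the standard closed-form Black-Scholes call price (quoted here as a known result, not derived at this point in the text) and evaluates the left-hand side of the PDE by central finite differences. The function names and step sizes are illustrative assumptions.

```python
import math
from statistics import NormalDist

_norm_cdf = NormalDist().cdf  # standard normal distribution function


def bs_call(S, K, r, sigma, tau):
    """Standard Black-Scholes call price with time to maturity tau = T - t."""
    if tau <= 0.0:
        return max(S - K, 0.0)  # terminal condition C(T) = max(S_T - K, 0)
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * _norm_cdf(d1) - K * math.exp(-r * tau) * _norm_cdf(d2)


def bs_pde_residual(t, S, K, r, sigma, T, h=1e-2, dt=1e-4):
    """Finite-difference value of C_t + r S C_S + (1/2) sigma^2 S^2 C_SS - r C."""
    def price(t_, s_):
        return bs_call(s_, K, r, sigma, T - t_)

    c_t = (price(t + dt, S) - price(t - dt, S)) / (2.0 * dt)
    c_s = (price(t, S + h) - price(t, S - h)) / (2.0 * h)
    c_ss = (price(t, S + h) - 2.0 * price(t, S) + price(t, S - h)) / h ** 2
    return c_t + r * S * c_s + 0.5 * sigma ** 2 * S ** 2 * c_ss - r * price(t, S)
```

The residual should be zero up to discretisation and rounding error at any interior point $(t, S)$ with $t < T$.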






6.3.3 The Feynman-Kac Theorem 

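Consider an ($n$-dimensional) SDE

\[ dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t \]

with $d$ sources of noise. Thus $\mu$ is an $n$-vector, and $\sigma$ an $n \times d$-matrix.

Let $f(t, x^1, \dots, x^n) : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}$ be a $C^{1,2}$-function. Let $\nabla_x f(t, x^1, \dots, x^n) = \left( \frac{\partial f}{\partial x^1}, \dots, \frac{\partial f}{\partial x^n} \right)$
and let $C = \sigma\sigma^{tr}$. Note that $C$ is an $n \times n$-matrix, and that $C_{ij} = \sigma_i \cdot \sigma_j$, where $\sigma_i$ is the $i^{th}$
row of the matrix $\sigma$.

By Ito's formula

\[ df(t, X_t) = \sum_{i=1}^n \sum_{j=1}^d \frac{\partial f}{\partial x^i}\sigma_{ij}\,dW^j_t + \left( \frac{\partial f}{\partial t} + \sum_{i=1}^n \mu_i\frac{\partial f}{\partial x^i} + \frac{1}{2}\sum_{i,j=1}^n C_{ij}\frac{\partial^2 f}{\partial x^i \partial x^j} \right) dt \]

i.e.

\[ df(t, X_t) = \nabla_x f \cdot \sigma\,dW_t + \left( \frac{\partial f}{\partial t} + \sum_{i} \mu_i\frac{\partial f}{\partial x^i} + \frac{1}{2}\sum_{i,j} C_{ij}\frac{\partial^2 f}{\partial x^i \partial x^j} \right) dt \]

Definition 6.3.2 The infinitesimal generator $\mathcal{A}$ of a diffusion $X_t$ (satisfying $dX_t = \mu\,dt +
\sigma\,dW_t$) is defined by

\[ \mathcal{A} = \sum_{i=1}^n \mu_i\frac{\partial}{\partial x^i} + \frac{1}{2}\sum_{i,j=1}^n C_{ij}\frac{\partial^2}{\partial x^i \partial x^j} \]

where $C = \sigma\sigma^{tr}$.

□

Thus

\[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mathcal{A}f \right) dt + \nabla_x f \cdot \sigma\,dW_t \]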



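Consider now the following Cauchy problem:

\[ \frac{\partial V}{\partial t} + \mu(t,x)\frac{\partial V}{\partial x} + \frac{1}{2}\sigma^2(t,x)\frac{\partial^2 V}{\partial x^2} = 0, \qquad V(T,x) = \Phi(x) \tag{$*$} \]

We will solve this PDE probabilistically, by running a diffusion.

To solve it, we must find $V(t,x)$, for $0 \le t \le T$ and $x \in \mathbb{R}$. So fix $t \le T$ and $x \in \mathbb{R}$, and
define a 1-dimensional diffusion $X$ to be the solution of

\[ dX_s = \mu(s, X_s)\,ds + \sigma(s, X_s)\,dW_s \quad (t \le s \le T), \qquad X_t = x \]

Thus $X$ starts running at time $t$ from point $x$. The infinitesimal generator of $X$ is

\[ \mathcal{A} = \mu\frac{\partial}{\partial x} + \frac{1}{2}\sigma^2\frac{\partial^2}{\partial x^2} \]

Thus $(*)$ is just

\[ \frac{\partial V}{\partial t} + \mathcal{A}V = 0, \qquad V(T,x) = \Phi(x) \]

Applying Ito's formula to the process $V(s, X_s)$ yields

\[ dV_s = \sigma\frac{\partial V}{\partial x}\,dW_s + \left[ \frac{\partial V}{\partial s} + \mathcal{A}V \right] ds \]

Since the term in brackets is zero (Cauchy problem), we see that

\[ V(T, X_T) = V(t, X_t) + \int_t^T \sigma(s, X_s)\frac{\partial V}{\partial x}(s, X_s)\,dW_s \]

and thus, noting that $V(T, X_T) = \Phi(X_T)$, that $X_t = x$, and taking expectations on both
sides, that

\[ V(t,x) = \mathbb{E}^{t,x}[\Phi(X_T)] \]

Thus the solution $V(t,x)$ to the Cauchy problem can be obtained by running an SDE from
point $x$ at time $t$, waiting until time $T$, and finding the average value of the random variable
$\Phi(X_T)$. The superscripts $t,x$ on $\mathbb{E}^{t,x}$ simply denote that $X_t = x$.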

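Now $(*)$ is not quite the Black-Scholes equation, which has an additional term. However,
this can be removed. We obtain the following general, multidimensional, version:

Theorem 6.3.3 (Feynman-Kac Theorem)

Given

• A (column) vector-valued function $\mu : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^n$;

• A matrix-valued function $\sigma : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^{n \times d}$;

• A matrix-valued function $C$ which is of the form $C = \sigma\sigma^{tr}$;

• A real-valued function $r : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}$;

• A real-valued function $\Phi : \mathbb{R}^n \to \mathbb{R}$;

Given a solution $V(t,x)$ to the boundary value problem

\[ \frac{\partial V}{\partial t} + \sum_{i=1}^n \mu_i\frac{\partial V}{\partial x^i} + \frac{1}{2}\sum_{i,k=1}^n C_{ik}\frac{\partial^2 V}{\partial x^i \partial x^k} - rV = 0, \qquad V(T,x) = \Phi(x) \]

and assuming sufficient integrability, we can calculate $V(t,x)$ as follows:

(i) Fix $t \le T$ and $x \in \mathbb{R}^n$;

(ii) Let $X_s$ be the solution to

\[ dX_s = \mu(s, X_s)\,ds + \sigma(s, X_s)\,dW_s \quad (t \le s \le T), \qquad X_t = x \]

Then

\[ V(t,x) = \mathbb{E}^{t,x}\left[ e^{-\int_t^T r(u, X_u)\,du}\,\Phi(X_T) \right] \]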






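Proof: Let

\[ Y_s = e^{-\int_t^s r(u, X_u)\,du}\,V(s, X_s) \]

Then

\begin{align*}
dY_s &= -r(s, X_s)\,e^{-\int_t^s r(u, X_u)\,du}\,V\,ds \\
     &\quad + e^{-\int_t^s r(u, X_u)\,du}\left[ \nabla_x V \cdot \sigma\,dW_s + \left( \frac{\partial V}{\partial s} + \sum_i \mu_i\frac{\partial V}{\partial x^i} + \frac{1}{2}\sum_{i,k} C_{ik}\frac{\partial^2 V}{\partial x^i \partial x^k} \right) ds \right] \\
     &= e^{-\int_t^s r(u, X_u)\,du}\,\nabla_x V \cdot \sigma\,dW_s
\end{align*}

because $V$ solves the boundary value problem. Thus $Y_T = Y_t + \int_t^T e^{-\int_t^s r(u, X_u)\,du}\,\nabla_x V \cdot \sigma\,dW_s$. Now note that $Y_t = V(t,x)$, and that $Y_T =
e^{-\int_t^T r(u, X_u)\,du}\,\Phi(X_T)$. Taking expectations yields the result.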

□ 

Remarks 6.3.4 Consider a European contingent claim $C$ on a share $S$ with payoff $\Phi(S_T)$ at
time $T$. Assuming that share prices follow a geometric Brownian motion (with constant drift
and volatility), and that the interest rate $r$ is constant, the Black-Scholes PDE to be solved
is

\[ \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 C}{\partial S^2} - rC = 0, \qquad C(T, S_T) = \Phi(S_T) \]

Using the Feynman-Kac Theorem, we obtain the solution as follows: Find the solution to the
following SDE:

\[ dS_t = rS_t\,dt + \sigma S_t\,dW_t \]

This diffusion is not the share price, although we've denoted it by the same symbol. It is simply
a solution to the above SDE, which is obtained directly from the Feynman-Kac formula.
However, it looks exactly like the SDE for the share price in the riskneutral world!
Run it until time $T$. Then

\[ C(0, S_0) = \mathbb{E}^{0,S_0}\left[ e^{-rT}\,\Phi(S_T) \right] \]

Thus the price of the option is its discounted expected value, where the expectation is taken
under a measure where $S$ follows a geometric Brownian motion with drift rate $r$ and volatility
$\sigma$.

□ 
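The Feynman-Kac recipe of Remarks 6.3.4 can be sketched as a Monte Carlo pricer. The code below is illustrative only: it samples $S_T$ exactly from the lognormal solution of $dS_t = rS_t\,dt + \sigma S_t\,dW_t$ (so no time-stepping is needed) and discounts the average payoff. The function name and all parameter values are hypothetical.

```python
import math
import random


def feynman_kac_price(payoff, s0, r, sigma, T, n_paths=200_000, seed=0):
    """Estimate C(0, S_0) = E[exp(-r T) * payoff(S_T)] under dS = r S dt + sigma S dW."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * T
    vol = sigma * math.sqrt(T)
    total = 0.0
    for _ in range(n_paths):
        # Exact sample of S_T = S_0 exp((r - sigma^2/2) T + sigma W_T)
        s_T = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += payoff(s_T)
    return math.exp(-r * T) * total / n_paths


def call_payoff(strike):
    """Payoff Phi(s) = max(s - K, 0) of a European call with strike K."""
    return lambda s: max(s - strike, 0.0)
```

Two quick checks of the design: with the payoff $\Phi(s) = s$ the estimate should return roughly $S_0$ (the discounted price is a martingale under this measure), and with `call_payoff(K)` it should agree with the Black-Scholes call price up to Monte Carlo error.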



We have therefore reconciled the stochastic (riskneutral) and PDE ap- 
proaches to pricing derivatives via the Feynman-Kac formula. 



However, it should be pointed out that the riskneutral approach works in a general "semi- 
martingale context" (where prices, rates, etc. are semimartingales), and not just in a "diffu- 
sion context" (where prices, rates, etc. are given as solutions to Ito diffusions, and are thus 
necessarily Markov processes) . Hence the stochastic approach to finance is considerably more 
general. 






Chapter 7 

Financial Modelling in Continuous 
Time 

7.1 Stochastic Financial Modelling 
7.1.1 Basic Notions

We give here a quick introduction to the basics of stochastic financial modelling. We start
with the following set-up:

• A market model is a tuple

\[ (\Omega, \mathcal{F}, \mathbb{P}, (\mathcal{F}_t)_t, (S_t)_t) \]

where $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space, $(\mathcal{F}_t)_t$ a filtration satisfying the usual conditions,
and $S_t = (S^0_t, \dots, S^N_t)$ an $(N+1)$-dimensional adapted cadlag semimartingale.

• We will often assume a finite horizon [0,T], e.g. to price European options. 

• We also make the usual assumptions about the market: 

— No transaction costs 

— Continuous trading 

— Liquid markets for every security 

— Short sales allowed 

— Perfect divisibility of assets 

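To get results, we will usually specialize: We will generally assume that $\Omega$ comes with a
$K$-dimensional Brownian motion $W_t = (W^1_t, \dots, W^K_t)$ which generates the filtration $\mathcal{F}_t$ (augmented
to satisfy the usual conditions). We say that we have $K$ sources of noise. Further,
we assume that the asset price process is given by an Ito diffusion:

\[ dS_t = \mu(t, S_t)\,dt + \sigma(t, S_t)\,dW_t \]

which is shorthand for

\[ d\begin{pmatrix} S^0_t \\ \vdots \\ S^N_t \end{pmatrix}
 = \begin{pmatrix} \mu^0(t, S_t) \\ \vdots \\ \mu^N(t, S_t) \end{pmatrix} dt
 + \begin{pmatrix} \sigma_{01}(t, S_t) & \cdots & \sigma_{0K}(t, S_t) \\ \vdots & & \vdots \\ \sigma_{N1}(t, S_t) & \cdots & \sigma_{NK}(t, S_t) \end{pmatrix}
   \begin{pmatrix} dW^1_t \\ \vdots \\ dW^K_t \end{pmatrix} \]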








Under these conditions, the asset price process is (strong) Markov. 

Generally, we make another assumption on $S^0_t$: we assume that it is the money market
account process ("riskless" bank account process), which has dynamics

\[ dS^0_t = rS^0_t\,dt, \qquad S^0_0 = 1 \]

A numeraire is a price process $N_t$ which has $N_t > 0$ a.s. Think of a numeraire as a unit
into which other assets are translated. Thus if $S_t$ is the price of $S$ in money, then $\bar{S}_t = \frac{S_t}{N_t}$ is
the price of $S$ in units of $N$.

We often choose the numeraire to be the money market account process $S^0_t$. In that case,
we write $\bar{S}_t = \frac{S_t}{S^0_t}$ for the value of $S_t$ in terms of the numeraire. Of course, $\bar{S}_t$ is just the
discounted value of $S$ at time $t$.

A European contingent claim $C$ is a derivative which, at some future time $T$, has a payoff
which is a known function of asset prices at time $T$, i.e.

\[ C_T = f(S_T) \]

so that $C_T$ is an $\mathcal{F}_T$-measurable random variable. The time $T$ is called the maturity or
exercise time of the claim.

A central problem is the pricing and hedging of such derivatives. A European claim can 
be priced by arbitrage methods only if there is a trading strategy which exactly replicates its 
payoff. 

Definition 7.1.1 • A trading strategy/portfolio is a left-continuous (or, more generally,
predictable) process $\varphi_t = (\varphi^0_t, \dots, \varphi^N_t)$ which is integrable w.r.t. the semimartingale $S_t$.

$\varphi^n_t$ is to be thought of as the number of units of asset $n$ held in the portfolio at time $t$.

• The value process of the portfolio is given by

\[ V_t(\varphi) = \varphi_t \cdot S_t = \sum_{n=0}^N \varphi^n_t S^n_t \]

and the gains process by

\[ G_t(\varphi) = V_t(\varphi) - V_0(\varphi) \]

• If $N$ is a numeraire, we may also introduce numeraire-deflated ("discounted") value-
and gains processes by

\[ \bar{V}_t(\varphi) := \frac{V_t(\varphi)}{N_t}, \qquad \bar{G}_t(\varphi) := \bar{V}_t(\varphi) - \bar{V}_0(\varphi) \]

• A trading strategy $\varphi$ is self-financing if and only if $d(\varphi_t \cdot S_t) = \varphi_t \cdot dS_t$, i.e. if and only
if

\[ G_t(\varphi) = \int_0^t \varphi_u \cdot dS_u \]



□ 






Remarks 7.1.2 • The intuition behind the self-financing condition is a bit convoluted.
Discretize time, and suppose that $\varphi_t$ is the portfolio held over a period $[t, t + \Delta t]$. To
be self-financing means that the value of the portfolio doesn't change purely because of
rebalancing. Thus, at time $t$, the portfolio $\varphi_{t-\Delta t}$ is rebalanced to become the portfolio
$\varphi_t$. The value at time $t$ of these portfolios is the same: $\varphi_{t-\Delta t} \cdot S_t = \varphi_t \cdot S_t$, i.e. $(\varphi_t -
\varphi_{t-\Delta t}) \cdot S_t = 0$. It is tempting to deduce that, in the limit $\Delta t \to 0$, we obtain the
self-financing condition

\[ S_t\,d\varphi_t = 0 \tag{$*$} \]

However, it would be wrong to use $(*)$ as the self-financing condition in continuous
time, because:

(i) Stochastic integrals are to be interpreted in the Ito sense.

(ii) If $H_t$ is left-continuous, then the stochastic integral

\[ \int H_t\,dX_t = \lim \sum_i H_{t_i}\,(X_{t_{i+1}} - X_{t_i}) \]

is a limit (in probability) of left-hand Riemann-Stieltjes sums.

(iii) $S_t(\varphi_t - \varphi_{t-\Delta t})$ looks like a term in a right-hand sum.

This problem is fixed rather easily: Add and subtract $S_{t-\Delta t} \cdot (\varphi_t - \varphi_{t-\Delta t})$ on the left-hand side
of the discrete condition $(\varphi_t - \varphi_{t-\Delta t}) \cdot S_t = 0$ to obtain:

\[ S_{t-\Delta t}(\varphi_t - \varphi_{t-\Delta t}) + (S_t - S_{t-\Delta t})(\varphi_t - \varphi_{t-\Delta t}) = 0 \]

In the continuous-time limit, this looks like

\[ S_t\,d\varphi_t + d[S, \varphi]_t = 0 \tag{$**$} \]

because the first term is a left-hand sum, and the second term looks like a summand in
the covariation process. By Ito's formula, we have $d(\varphi_t \cdot S_t) = \varphi_t\,dS_t + S_t\,d\varphi_t + d[S, \varphi]_t =
\varphi_t \cdot dS_t$. (But we must stress that the above argument is a purely intuitive formulation
of the self-financing condition, as trading strategies need not be semimartingales, so
that terms like $S_t \cdot d\varphi_t$ and $d[S, \varphi]_t$ need not make sense.)

• In the literature, other conditions are often imposed on trading strategies to ensure
that they are sufficiently well-behaved. For example, a self-financing trading strategy
is called tame if $V_t(\varphi) \ge 0$ a.s. It is called admissible if its discounted value is a
martingale under the EMM. This is important, because even the Black-Scholes model
has "doubling" strategies, and is not arbitrage-free if arbitrarily large losses can be
sustained. However, we will ignore these technical points in what follows.

□ 

Definition 7.1.3 A (European) contingent claim $C$ is said to be attainable if and only if there
exists a self-financing strategy $\varphi_t$ such that $C_T = V_T(\varphi)$ (where $T$ is the exercise date of the
claim). Then $\varphi$ is called a replicating portfolio for $C$.

A market model is complete if and only if every contingent claim is attainable.






□ 

Proposition 7.1.4 (Numeraire) 

A self-financing portfolio remains self-financing under a change of numeraire. 

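From one point of view, this seems totally obvious: After all, if we don't add or subtract
funds from our portfolio when we reckon in units of money, we don't add or subtract funds
if we reckon in units of barrels of oil either. However, our definition of self-financing is that
$d(\varphi_t \cdot S_t) = \varphi_t \cdot dS_t$. Now suppose that we reckon in terms of a new numeraire $N_t$. Let
$\bar{S}_t := \frac{S_t}{N_t}$ be the price of $S$ in units of $N$. To prove that the self-financing condition holds, we
must show that $d(\varphi_t \cdot \bar{S}_t) = \varphi_t \cdot d\bar{S}_t$ (i.e. that $\bar{G}_t(\varphi) = \int_0^t \varphi_u \cdot d\bar{S}_u$), and this no longer seems
so obvious.

Proof: Let $V_t = \varphi_t \cdot S_t$. Then by Ito's formula

\begin{align*}
d\left( \frac{V_t}{N_t} \right) &= \frac{1}{N_t}\,dV_t + V_t\,d\left( \frac{1}{N_t} \right) + d\left[ V, \frac{1}{N} \right]_t \\
 &= \frac{\varphi_t}{N_t} \cdot dS_t + \varphi_t \cdot S_t\,d\left( \frac{1}{N_t} \right) + \varphi_t \cdot d\left[ S, \frac{1}{N} \right]_t
\end{align*}

because (by the self-financing condition) $dV_t = \varphi_t \cdot dS_t$, so $d[V, \frac{1}{N}]_t = \varphi_t \cdot d[S, \frac{1}{N}]_t$. Thus

\begin{align*}
d\bar{V}_t &= \varphi_t \cdot \left( \frac{1}{N_t}\,dS_t + S_t\,d\left( \frac{1}{N_t} \right) + d\left[ S, \frac{1}{N} \right]_t \right) \\
 &= \varphi_t \cdot d\left( \frac{S_t}{N_t} \right) \\
 &= \varphi_t \cdot d\bar{S}_t
\end{align*}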



□ 



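Corollary 7.1.5 If a contingent claim is attainable in a given numeraire, it is also attainable
in any other numeraire, and the replicating portfolio is the same, i.e. if

\[ X = V_0 + \int_0^T \varphi_t \cdot dS_t \quad \text{then} \quad \bar{X} = \bar{V}_0 + \int_0^T \varphi_t \cdot d\bar{S}_t \]

Proof: Suppose that $\varphi$ is a self-financing strategy that replicates $X$, so that $X = V_T(\varphi) =
V_0 + G_T(\varphi) = V_0 + \int_0^T \varphi_t \cdot dS_t$. Since we have shown above that $d(\varphi_t \cdot \bar{S}_t) = \varphi_t \cdot d\bar{S}_t$, we obtain

\[ \bar{X} = \bar{V}_T(\varphi) = \bar{V}_0 + \bar{G}_T(\varphi) = \bar{V}_0 + \int_0^T \varphi_t \cdot d\bar{S}_t \]

□

In particular, if the numeraire is the bank account, then

\[ \bar{V}_t(\varphi) = \bar{V}_0(\varphi) + \int_0^t \varphi_u \cdot d\bar{S}_u \]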






Remarks 7.1.6 A self-financing portfolio $\varphi = (\varphi^0, \dots, \varphi^N)$ is completely determined by its initial value and
$N$ of the $N + 1$ components. Thus, e.g., if we take the bank account to be the numeraire,
and if we are given the risky asset components $\varphi^1, \dots, \varphi^N$, the value of the riskless asset
component $\varphi^0$ is completely determined by the self-financing condition: Take $S^0$ to be the
numeraire, so that

\[ \varphi_t \cdot \bar{S}_t = \bar{V}_t(\varphi) = \bar{V}_0(\varphi) + \sum_{n=0}^N \int_0^t \varphi^n_u\,d\bar{S}^n_u = \bar{V}_0(\varphi) + \sum_{n=1}^N \int_0^t \varphi^n_u\,d\bar{S}^n_u \]

because $d\bar{S}^0_t = 0$, as $\bar{S}^0 = 1$ is constant. Hence

\[ \varphi^0_t = \bar{V}_0(\varphi) + \sum_{n=1}^N \int_0^t \varphi^n_u\,d\bar{S}^n_u - \sum_{n=1}^N \varphi^n_t \bar{S}^n_t \]

This is important, because it means that, given a portfolio of risky securities $(\varphi^1, \dots, \varphi^N)$,
we can make the portfolio self-financing simply by adjusting the bank account. This will not
affect the discounted gain of the portfolio at all, as $d\bar{S}^0_t = 0$. The same goes if we take $S^0$ to
be a numeraire other than the bank account.

□ 
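The observation in Remarks 7.1.6 has a simple discrete-time analogue, sketched below under the assumption of a single risky asset and finitely many rebalancing dates $t_0 < \dots < t_n$. Given the risky holdings, the bank holdings are chosen at each date so that rebalancing neither adds nor removes wealth. The function name and data layout are hypothetical.

```python
def bank_holdings(phi_risky, s_risky, s_bank, v0):
    """Bank-account holdings that make a discrete portfolio self-financing.

    phi_risky[i] : units of the risky asset held over [t_i, t_{i+1}]
    s_risky[i], s_bank[i] : asset prices at t_i (one more entry than phi_risky)
    v0 : initial wealth
    Returns (phi_bank, values), where values[i] is the portfolio value at t_i.
    """
    phi_bank = []
    values = [v0]
    value = v0
    for i in range(len(phi_risky)):
        # Rebalance at t_i: current wealth must exactly fund the new holdings
        pb = (value - phi_risky[i] * s_risky[i]) / s_bank[i]
        phi_bank.append(pb)
        # Mark to market at t_{i+1}, before the next rebalancing
        value = pb * s_bank[i + 1] + phi_risky[i] * s_risky[i + 1]
        values.append(value)
    return phi_bank, values
```

By construction the change in value over each period equals the gain from holding the portfolio, and the discounted gain depends only on the risky holdings: $\bar V_{t_{i+1}} - \bar V_{t_i} = \varphi_{t_i}(\bar S_{t_{i+1}} - \bar S_{t_i})$.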

7.1.2 Martingale Pricing 

Recall that an arbitrage strategy is a trading strategy $\varphi$ with the properties that

• $V_0(\varphi) = 0$: initial cost is zero.

• $\mathbb{P}(G_T \ge 0) = 1$, i.e. zero probability of a loss.

• $\mathbb{P}(G_T > 0) > 0$: positive probability of a profit.

Exercise 7.1.7 It is often convenient to use a slightly different definition: Let $N$ be a
numeraire. Show that there is arbitrage if and only if there is a portfolio $\varphi$ such that
$\mathbb{P}(\bar{G}_T(\varphi) \ge 0) = 1$ and $\mathbb{P}(\bar{G}_T(\varphi) > 0) > 0$.

□ 

Definition 7.1.8 Suppose that $N$ is a numeraire. A measure $\mathbb{Q}$ on $(\Omega, \mathcal{F}_T)$ is an equivalent
martingale measure (EMM) for numeraire $N$ if and only if

(i) $\mathbb{Q} \sim \mathbb{P}$;

(ii) $\bar{S} = \left( \frac{S_t}{N_t} \right)_t$ is a (local) $\mathbb{Q}$-martingale.

If $\bar{S}_t$ is a $\mathbb{Q}$-martingale, $\mathbb{Q}$ is called a strong EMM.

An EMM associated with the money market account is called a riskneutral measure.

□ 






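If $N$ is a numeraire, define $\bar{V}_t(\varphi) = \frac{V_t(\varphi)}{N_t}$ and define

\[ \bar{G}_t(\varphi) = \int_0^t \varphi_u \cdot d\bar{S}_u \]

Note that if $\mathbb{Q}$ is an EMM for $N$, then both $\bar{V}$ and $\bar{G}$ are $\mathbb{Q}$-local martingales. Indeed,
$\bar{G}_t = \int_0^t \varphi_u \cdot d\bar{S}_u$ is a sum of stochastic integrals w.r.t. a $\mathbb{Q}$-local martingale.

We require $\mathbb{Q}$ to be equivalent to $\mathbb{P}$ so that both measures have the same arbitrage strategies:
$\mathbb{P}(G_T > 0) > 0$ if and only if $\mathbb{Q}(G_T > 0) > 0$.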

Further note that (ignoring some technical conditions): 

• An arbitrage opportunity remains an arbitrage under 

— a change of equivalent measure; 

— a change of numeraire. 

• A replicating portfolio remains a replicating portfolio under 

— a change of equivalent measure; 

— a change of numeraire. 

Theorem 7.1.9 If an EMM Q exists (for some numeraire N), then there are no arbitrage 
opportunities. 

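Proof: If $\varphi$ is a self-financing strategy, then

\[ 0 = \bar{G}_0(\varphi) = \mathbb{E}_{\mathbb{Q}}[\bar{G}_T(\varphi)] \]

Now because $\mathbb{P}(G_T > 0) > 0$ if and only if $\mathbb{Q}(G_T > 0) > 0$, and because $\bar{G}_T > 0$ if and only
if $G_T > 0$, we cannot have both $G_T \ge 0$ and $\mathbb{E}_{\mathbb{P}}[G_T] > 0$. Thus $\varphi$ cannot be an arbitrage, i.e.
there are no arbitrage opportunities.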

□ 

Example 7.1.10 The most common choice of numeraire is the money market account. Suppose
that $S^0$ is the MMA, with price dynamics

\[ dS^0_t = r(t, \omega)S^0_t\,dt \]

If $\mathbb{Q}$ is the EMM associated with $S^0$, then each $\bar{S}^n$ is a $\mathbb{Q}$-local martingale. Now

\[ S^n_t = S^0_t\,\bar{S}^n_t \]

and so

\[ dS^n_t = S^0_t\,d\bar{S}^n_t + rS^n_t\,dt = rS^n_t\,dt + dM^n_t \]

where $M^n_t = \int_0^t S^0_u\,d\bar{S}^n_u$ is a $\mathbb{Q}$-local martingale. Conversely, if each $dS^n_t = rS^n_t\,dt + dM^n_t$ for
some $\mathbb{Q}$-local martingale $M^n$, then $\mathbb{Q}$ is a riskneutral measure.

□ 






Theorem 7.1.11 (Martingale Valuation) 

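Suppose that $X$ is an attainable contingent claim, and that $\mathbb{Q}$ is an EMM for numeraire $N$.
Then

\[ \bar{X}_t = \mathbb{E}_{\mathbb{Q}}[\bar{X}_T \mid \mathcal{F}_t] \]

i.e.

\[ X_t = N_t\,\mathbb{E}_{\mathbb{Q}}\left[ \frac{X_T}{N_T} \,\Big|\, \mathcal{F}_t \right] \]

Proof: If $\varphi$ replicates $X$, it does so under any numeraire, any EMM. Now by the Law of One
Price, if $X_T = V_T(\varphi)$, then $X_t = V_t(\varphi)$ for all $t \le T$. (Else buy the cheaper of the two and
short the more expensive one at time $t$, and pocket the difference. At time $T$ your gains will
exactly match your obligations.) Thus

\[ \bar{X}_t = \bar{V}_t(\varphi) = \mathbb{E}_{\mathbb{Q}}[\bar{V}_T(\varphi) \mid \mathcal{F}_t] = \mathbb{E}_{\mathbb{Q}}[\bar{X}_T \mid \mathcal{F}_t] \]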



□ 



7.2 The Generalized Black-Scholes Model 

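As usual, we work in a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We assume that all information is contained
in a filtration $(\mathcal{F}_t)_t$, which is generated by a $K$-dimensional standard $\mathbb{P}$-Brownian motion
$W_t = (W^1_t, \dots, W^K_t)$, augmented to satisfy the usual conditions. We further assume that there
are $N$ risky assets whose price processes $S_t = (S^1_t, \dots, S^N_t)$ are continuous semimartingales,
indeed Ito diffusions, with dynamics of the form

\[ dS^n_t = \mu_n(t)S^n_t\,dt + S^n_t\sum_{k=1}^K \sigma_{nk}(t)\,dW^k_t \]

or

\[ \begin{pmatrix} dS^1_t \\ \vdots \\ dS^N_t \end{pmatrix}
 = \begin{pmatrix} S^1_t & & \\ & \ddots & \\ & & S^N_t \end{pmatrix}
   \left[ \begin{pmatrix} \mu_1(t) \\ \vdots \\ \mu_N(t) \end{pmatrix} dt
 + \begin{pmatrix} \sigma_{11}(t) & \cdots & \sigma_{1K}(t) \\ \vdots & & \vdots \\ \sigma_{N1}(t) & \cdots & \sigma_{NK}(t) \end{pmatrix}
   \begin{pmatrix} dW^1_t \\ \vdots \\ dW^K_t \end{pmatrix} \right] \]

i.e.

\[ dS_t = D[S_t]\mu(t)\,dt + D[S_t]\sigma(t)\,dW_t \]

where $\mu$ is the drift vector, $\sigma$ the volatility matrix, and $D[S_t]$ the diagonal matrix with asset
prices along the diagonal.

We further assume that we have at our disposal a money market account $A_t$ with dynamics

\[ dA_t = r(t)A_t\,dt, \qquad A_0 = 1 \]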



The money market account (MMA) will serve as our numeraire. 






7.2.1 Construction of a Risk-Neutral Measure via Girsanov's Theorem

Let $T$ be the horizon, i.e. we are only interested in the time interval $[0,T]$. Suppose that $\mathbb{Q}$
is an EMM for the MMA, so that $\bar{S}_t = \frac{S_t}{A_t}$ is an $N$-dimensional $\mathbb{Q}$-martingale. This means
that the drift of each risky asset must be $r$ under $\mathbb{Q}$, i.e. when we change the measure from
$\mathbb{P}$ to $\mathbb{Q}$, the drift of $S^n$ must change from $\mu_n$ to $r$.

To accomplish this change of measure via a Girsanov transformation, we need to find a
kernel $\lambda(t) \in \mathbb{R}^K$ such that $\sigma(t)\lambda(t) = r(t) - \mu(t)$ (where $r$ now doubles as the column vector
whose entries are $r$). If we can find such a $\lambda$, then Girsanov's Theorem tells us that we can
construct $\mathbb{Q}$ as follows: Let

\[ \xi = \mathcal{E}\left( \int \lambda \cdot dW_t \right)_T = e^{\int_0^T \lambda(t) \cdot dW_t - \frac{1}{2}\int_0^T \|\lambda(t)\|^2\,dt} \]

and define $\mathbb{Q}$ by

\[ \mathbb{Q}(F) = \int_F \xi\,d\mathbb{P} \]

Then

\[ \tilde{W}_t = W_t - \int_0^t \lambda(u)\,du \]

is a $K$-dimensional standard $\mathbb{Q}$-Brownian motion.
The dynamics of $S_t$ under $\mathbb{Q}$ will therefore be

\begin{align*}
dS_t &= D[S_t]\big( \mu(t, S_t)\,dt + \sigma(t, S_t)(d\tilde{W}_t + \lambda\,dt) \big) \\
     &= D[S_t]\big( r(t)\,dt + \sigma(t, S_t)\,d\tilde{W}_t \big)
\end{align*}

Hence the drift of each asset is indeed $r$ under $\mathbb{Q}$, so that $\mathbb{Q}$ is a riskneutral measure.

Thus in order to construct a riskneutral measure, it is necessary that we are able to solve

\[ \sigma\lambda = r - \mu \]

for $\lambda$. The above is a system of $N$ linear equations in $K$ unknowns, and will generally have
a solution if $N \le K$. If $N > K$, then the system is overdetermined, and a solution will
only exist in special circumstances. However, if a solution $\lambda$ does not exist, then we are
unable to construct a riskneutral measure, and this means that there is arbitrage (see the
next subsection). It follows that the force of arbitrage in the market will force those special
circumstances to hold.

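Example 7.2.1 Suppose we have a Black-Scholes model with two risky assets but only one
source of noise, i.e. $N = 2$, $K = 1$:

\begin{align*}
dS^1_t &= \mu_1 S^1_t\,dt + \sigma_1 S^1_t\,dW_t \\
dS^2_t &= \mu_2 S^2_t\,dt + \sigma_2 S^2_t\,dW_t
\end{align*}

To find a riskneutral measure, we seek a Girsanov kernel $\lambda$ solving

\[ \begin{pmatrix} \sigma_1 \\ \sigma_2 \end{pmatrix} \lambda = \begin{pmatrix} r - \mu_1 \\ r - \mu_2 \end{pmatrix} \]

Here $\lambda$ is a number (because $K = 1$). A solution will only exist if

\[ \frac{r - \mu_1}{\sigma_1} = \frac{r - \mu_2}{\sigma_2} \]

as you will easily verify.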






□ 
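The consistency condition in Example 7.2.1 is easy to check mechanically. The sketch below handles the one-factor case ($K = 1$) with any number of assets, under the assumption that all volatilities are non-zero; the function name and tolerance are illustrative choices.

```python
def girsanov_kernel_one_factor(mus, sigmas, r, tol=1e-12):
    """Solve sigma_n * lam = r - mu_n for a single scalar lam (K = 1).

    Returns lam if every asset gives the same value (r - mu_n) / sigma_n,
    and None if the overdetermined system is inconsistent, in which case no
    riskneutral measure of this form exists. Assumes every sigma_n != 0.
    """
    candidates = [(r - mu) / sigma for mu, sigma in zip(mus, sigmas)]
    lam = candidates[0]
    if all(abs(c - lam) <= tol for c in candidates):
        return lam
    return None
```

For instance, drifts 8% and 11% with volatilities 10% and 20% and $r = 5\%$ are consistent (both assets give the same ratio), whereas changing the second drift to 10% makes the system unsolvable.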

Remarks 7.2.2 The Girsanov kernel $\lambda$ is very closely related to a quantity called the market
price of risk. Consider a Black-Scholes model with only one source of noise, as in the example
above. Then we can only solve the system $\sigma\lambda = r - \mu$ if

\[ \lambda = \frac{r - \mu_n}{\sigma_n} \quad \text{for all } n \]

The negative of this quantity,

\[ -\lambda = \frac{\mu - r}{\sigma} \]

is called the market price of risk. We therefore see that, for there to be no arbitrage, all assets
must have the same market price of risk.

The reason for the name market price of risk is as follows: $\mu - r$ is the excess rate of
return of the asset (above the risk-free rate). Thus the ratio $\frac{\mu - r}{\sigma}$ can be interpreted as
the excess rate of return per unit of volatility.

In the case that we have more than one source of noise, each source can be ascribed its
own market price of risk: The MPR of noise source $W^k$ is simply $-\lambda_k$, the negative of the
$k^{th}$ component of the Girsanov kernel. We have

\[ \sigma_{n1}(-\lambda_1) + \dots + \sigma_{nK}(-\lambda_K) = \mu_n - r \]

so a slight increase $\varepsilon$ in the volatility $\sigma_{nk}$ corresponding to noise source $W^k$ will result in an
increase $(-\lambda_k)\varepsilon$ in the excess rate of return. Thus $-\lambda_k$ can be regarded as the excess rate of
return caused by a unit change in the volatility corresponding to the $k^{th}$ source of noise.

□ 

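7.2.2 No-Arbitrage and the Existence of a Risk-Neutral Measure

We will sometimes denote the dot product of two vectors $x, y$ by

\[ x \cdot y = x^{tr}y = \langle x, y \rangle \]

where $x^{tr}$ denotes the transpose of $x$. Observe that if $A$ is a matrix, then $\langle x, Ay \rangle = \langle A^{tr}x, y \rangle$
(because $(A^{tr}x)^{tr}y = x^{tr}Ay$). We recall here a lemma from linear algebra:

Lemma 7.2.3 If $\sigma$ is an $n \times d$-matrix (i.e. a linear operator $\sigma : \mathbb{R}^d \to \mathbb{R}^n$), then

\[ (\ker\sigma)^\perp = \operatorname{ran}(\sigma^{tr}) \]

Proof: $x \in \ker\sigma \Rightarrow \langle y, \sigma x \rangle = \langle \sigma^{tr}y, x \rangle = 0$ for all $y \in \mathbb{R}^n$. So $\sigma^{tr}y \perp \ker\sigma$ for all
$y \in \mathbb{R}^n$, i.e. $\operatorname{ran}(\sigma^{tr}) \subseteq (\ker\sigma)^\perp$. As $\dim\operatorname{ran}(\sigma) + \dim\ker(\sigma) = d = \dim(\ker\sigma)^\perp + \dim\ker\sigma$,
and as $\dim\operatorname{ran}(\sigma^{tr}) = \dim\operatorname{ran}(\sigma)$ (row rank equals column rank),
we see that $\dim\operatorname{ran}(\sigma^{tr}) = \dim(\ker\sigma)^\perp$. Hence $\operatorname{ran}(\sigma^{tr}) = (\ker\sigma)^\perp$.

□

Lemma 7.2.4 If $P : \mathbb{R}^n \to \mathbb{R}^n$ is the orthogonal projection onto a subspace $V \subseteq \mathbb{R}^n$, then
$\langle Px, y \rangle = \langle Px, Py \rangle = \langle x, Py \rangle$.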





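Proof: If $y = Py + y^\perp$ is the orthogonal decomposition, then $\langle Px, y \rangle = \langle Px, Py + y^\perp \rangle =
\langle Px, Py \rangle$. The second equality follows by symmetry.

Suppose now that $dS_t = D[S_t](\mu_t\,dt + \sigma_t\,dW_t)$, where the asset price process $S$ is $N$-dimensional,
and $W$ is a $K$-dimensional standard Brownian motion. The volatility process $\sigma$
is an $N \times K$-matrix process. Let $P : \mathbb{R}^N \to \mathbb{R}^N$ be the orthogonal projection onto the subspace
$\ker\sigma^{tr}$. Define

\[ p(t) := P(\mu(t) - r\mathbf{1}) \]

We omit the (technical) proof that $p(t)$ is measurable. Further, for $n = 1, \dots, N$ define
portfolio components

\[ \theta^n_t = \begin{cases} \dfrac{p_n(t)}{\|p(t)\|\,\bar{S}^n_t} & \text{if } p(t) \ne 0 \\[2mm] 0 & \text{else} \end{cases} \]

and choose $\theta^0$ to make the portfolio self-financing with initial value 0. Then

\begin{align*}
\bar{G}_T(\theta) &= \int_0^T \theta_t \cdot d\bar{S}_t \\
 &= \int_0^T I_{\{p(t) \ne 0\}}\,\frac{1}{\|p(t)\|}\,\Big[ p(t) \cdot (\mu(t) - r\mathbf{1})\,dt + p(t) \cdot \sigma_t\,dW_t \Big] \\
 &= \int_0^T I_{\{p(t) \ne 0\}}\,\frac{\|p(t)\|^2}{\|p(t)\|}\,dt \\
 &= \int_0^T I_{\{p(t) \ne 0\}}\,\|p(t)\|\,dt
\end{align*}

because $p(t) \cdot (\mu - r\mathbf{1}) = p(t) \cdot p(t) = \|p(t)\|^2$, and because $\sigma^{tr}p(t) = 0$, as $p(t) \in \ker(\sigma^{tr})$.

Thus $\bar{G}_T(\theta) \ge 0$, with strict inequality on a set of positive probability unless $p(t) = 0$ a.e., i.e. unless $\mu - r\mathbf{1} \in (\ker\sigma^{tr})^\perp = \operatorname{ran}\sigma$. Now
if there is no arbitrage, then we cannot have $\bar{G}_T(\theta) > 0$ with positive probability, and so there must be $\lambda$ such that
$\sigma\lambda = \mu - r\mathbf{1}$.

Given such a $\lambda$, the previous subsection shows that $\xi = \mathcal{E}(-\lambda \bullet W)_T$ defines a risk-neutral
measure. We have therefore proved:

Theorem 7.2.5 (Fundamental Theorem of Asset Pricing for the Generalized Black-Scholes
Model)

The Generalized Black-Scholes Model is arbitrage-free if and only if there is a risk-neutral
measure.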



7.2.3 Hedging of European Contingent Claims 

We can use Thm. 7.1.11 to price a European-style contingent claim $X$ only when we know
that $X$ is attainable. To find a replicating portfolio $\theta$ so that $V_T(\theta) = X_T$ may be very
difficult, however. Provided that, roughly speaking, the underlying assets contain all
the information, the Martingale Representation Theorem may overcome this difficulty, by
showing that all contingent claims are attainable.

Suppose we have an arbitrage-free SSM $(\Omega, \mathcal{F}, \mathbb{P}, (\mathcal{F}_t)_{0 \le t \le T}, (S^0_t, \dots, S^N_t)_{0 \le t \le T})$. We assume
that all information is contained in the filtration $(\mathcal{F}_t)_t$, which is generated by a $K$-dimensional
standard $\mathbb{P}$-Brownian motion $W_t = (W^1_t, \dots, W^K_t)$ (augmented to satisfy the usual conditions).
We further assume that the $N$ risky assets, whose price processes are $S_t = (S^1_t, \dots, S^N_t)$, are
Ito diffusions, with dynamics of the form

\[ dS_t = D[S_t]\mu_t\,dt + D[S_t]\sigma_t\,dW_t \]

where $\mu$ is the drift vector, $\sigma$ the volatility matrix, and $D[S_t]$ the diagonal matrix with
asset prices along the diagonal. Because the model is assumed to be arbitrage-free, we
have at our disposal at least one risk-neutral measure $\mathbb{Q}$, which is obtained from $\mathbb{P}$ via a
Girsanov transformation with kernel some $K$-dimensional predictable process $\lambda$. The process
$W^{\mathbb{Q}}_t$, obtained from $W_t$ by the drift shift determined by $\lambda$ (as in Section 7.2.1),
is a $\mathbb{Q}$-Brownian motion. The discounted asset dynamics under $\mathbb{Q}$ are
thus

\[ d\bar{S}_t = D[\bar{S}_t]\sigma_t\,dW^{\mathbb{Q}}_t \]

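Suppose also that we have a European-style contingent claim $X$, with payoff $X_T$ at expiry
$T$. Define a $(\mathbb{Q}, (\mathcal{F}_t)_t)$-martingale $(M_t)_{t \le T}$ by

\[ M_t := \mathbb{E}_{\mathbb{Q}}[\bar{X}_T \mid \mathcal{F}_t] \]

By the martingale representation theorem, there is a $K$-dimensional predictable process $H_t$
so that

\[ \bar{X}_T = M_T = \mathbb{E}_{\mathbb{Q}}[\bar{X}_T] + \int_0^T H_t \cdot dW^{\mathbb{Q}}_t \]

But $d\bar{S}_t = D[\bar{S}_t]\sigma_t\,dW^{\mathbb{Q}}_t$. If we can find a left-inverse $\hat{\sigma}_t$ for the matrix $\sigma_t$ (i.e. $\hat{\sigma}_t\sigma_t = I_K$), we would get

\[ dW^{\mathbb{Q}}_t = \hat{\sigma}_t D[\bar{S}_t]^{-1}\,d\bar{S}_t \]

Now let

\[ \theta_t = H^{tr}_t\,\hat{\sigma}_t\,D[\bar{S}_t]^{-1} \]

be the risky-asset component of a portfolio $\theta$, i.e.

\[ \theta^n_t = \frac{1}{\bar{S}^n_t}\sum_{k=1}^K H^k_t\,\hat{\sigma}^{kn}_t \qquad n = 1, \dots, N \]

and choose $\theta^0$ to make the portfolio self-financing with initial value $\mathbb{E}_{\mathbb{Q}}[\bar{X}_T]$. Then

\begin{align*}
\bar{V}_T(\theta) &= \bar{V}_0(\theta) + \bar{G}_T(\theta) \\
 &= \mathbb{E}_{\mathbb{Q}}[\bar{X}_T] + \int_0^T \theta_t \cdot d\bar{S}_t \\
 &= \mathbb{E}_{\mathbb{Q}}[\bar{X}_T] + \int_0^T H^{tr}_t\,\hat{\sigma}_t D[\bar{S}_t]^{-1} D[\bar{S}_t]\sigma_t\,dW^{\mathbb{Q}}_t \\
 &= \mathbb{E}_{\mathbb{Q}}[\bar{X}_T] + \int_0^T H_t \cdot dW^{\mathbb{Q}}_t \\
 &= \bar{X}_T
\end{align*}

Thus $\theta$ is a replicating portfolio for $X$.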






It follows that if $\sigma$ has a left inverse, then any contingent claim is attainable, i.e. the
market is complete. We now give a rough, intuitive argument for when we will be able to
replicate an arbitrary $X$. In the above, we require $\theta$ so that

\[ \mathbb{E}_{\mathbb{Q}}[\bar{X}_T] + \int_0^T H_t \cdot dW^{\mathbb{Q}}_t = \mathbb{E}_{\mathbb{Q}}[\bar{X}_T] + \int_0^T \theta_t \cdot d\bar{S}_t \]

so we require

\[ H_t \cdot dW^{\mathbb{Q}}_t = \theta_t \cdot D[\bar{S}_t]\sigma_t\,dW^{\mathbb{Q}}_t \]

This means that we must solve, at each instant, the linear system of equations

\[ H = \theta \cdot D[\bar{S}]\sigma \]

for the risky components $\theta^1, \dots, \theta^N$, i.e. we have a system of $K$ equations in $N$ unknowns.
Roughly speaking, this means that we will have a solution as soon as there are at least as many variables
as constraints, i.e. when $N \ge K$. We expect the solution to be unique when $K = N$.

To find a risk-neutral measure, we must find a Girsanov kernel, i.e. a solution $\lambda$ to the
linear system $\sigma\lambda = r - \mu$. This is a system of $N$ equations in $K$ unknowns, and, roughly
speaking, will have a solution when $K \ge N$, and a unique solution when $K = N$.

Thus: If K < N, then we expect to be able to hedge every contingent claim in more than 
one way. We therefore expect there to be arbitrage, as different replicating portfolios need 
not have the same value. Thus we don't expect a risk-neutral measure to exist. On the other 
hand, if K > N, we expect that there will be many risk-neutral measures, and there will be 
contingent claims that cannot be replicated. These unattainable contingent claims may have 
different prices under different risk-neutral measures. 

The sweet spot is therefore $K = N$, with as many sources of noise $W$ as assets $S$: We expect that a unique risk-neutral measure exists, and that every contingent claim has a unique replicating portfolio. For that, we require $\sigma$ to be invertible.

7.2.4 European Vanilla Call and Put Options 

The following formula is basic for "lognormal pricing" and underlies not only the Black- 
Scholes formula for call- and put options, but also Black-type formulas for futures options, 
Margrabe options, caps and floors, swaptions, etc. 

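Proposition 7.2.6 If $X$ is lognormal, with $\ln X \sim N(\mu, s^2)$, then

$$\mathbb{E}[(X - K)^+] = \mathbb{E}[X]\,N(d_+) - K\,N(d_-)$$

where

$$d_\pm = \frac{\ln\frac{\mathbb{E}X}{K} \pm \frac12 s^2}{s}, \qquad \mathbb{E}X = e^{\mu + \frac12 s^2}$$

Proof: Let $Z := \frac{\ln X - \mu}{s}$, so that $X = e^{\mu + sZ}$ and $Z \sim N(0,1)$. Clearly $\mathbb{E}X = e^{\mu}\,\mathbb{E}[e^{sZ}] = e^{\mu + \frac12 s^2}$, using the moment generating function of a standard normal random variable. Then, using the symmetry of the standard normal distribution,

$$\begin{aligned}
\mathbb{E}[(X - K)^+] &= \frac{1}{\sqrt{2\pi}}\int_{\frac{\ln K - \mu}{s}}^{\infty}(e^{\mu + sz} - K)\,e^{-\frac{z^2}{2}}\,dz \\
&= e^{\mu + \frac12 s^2}\frac{1}{\sqrt{2\pi}}\int_{\frac{\ln K - \mu}{s}}^{\infty} e^{-\frac{(z-s)^2}{2}}\,dz - K\frac{1}{\sqrt{2\pi}}\int_{\frac{\ln K - \mu}{s}}^{\infty} e^{-\frac{z^2}{2}}\,dz \\
&= \mathbb{E}[X]\,P\Big(Z \le \frac{\mu - \ln K}{s} + s\Big) - K\,P\Big(Z \le \frac{\mu - \ln K}{s}\Big) \\
&= \mathbb{E}[X]\,P\Big(Z \le \frac{\ln\frac{\mathbb{E}X}{K} + \frac12 s^2}{s}\Big) - K\,P\Big(Z \le \frac{\ln\frac{\mathbb{E}X}{K} - \frac12 s^2}{s}\Big)
\end{aligned}$$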

We prefer to use the expression $\mathbb{E}[X]$, rather than $e^{\mu + \frac12 s^2}$, in the above formula, because very often $X$ will be the terminal value of some martingale, in which case $\mathbb{E}[X]$ is its initial value, and need not be calculated as it is already known.
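As a sanity check of Proposition 7.2.6, the closed form can be compared with a direct numerical evaluation of the defining integral. This is a minimal sketch: the function names, the midpoint rule, the truncation point `z_max` and the parameter values are all illustrative assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lognormal_call_expectation(mean_X, s, K):
    """E[(X - K)^+] for lognormal X with E[X] = mean_X and Var(ln X) = s^2."""
    d_plus = (math.log(mean_X / K) + 0.5 * s * s) / s
    d_minus = d_plus - s
    return mean_X * norm_cdf(d_plus) - K * norm_cdf(d_minus)

def lognormal_call_by_quadrature(mu, s, K, n=200_000, z_max=10.0):
    """Midpoint-rule approximation of the integral over z in the proof."""
    z_min = (math.log(K) - mu) / s          # the payoff vanishes below this point
    h = (z_max - z_min) / n
    total = 0.0
    for i in range(n):
        z = z_min + (i + 0.5) * h
        total += (math.exp(mu + s * z) - K) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

mu, s, K = 0.1, 0.4, 1.2                    # hypothetical parameters
closed_form = lognormal_call_expectation(math.exp(mu + 0.5 * s * s), s, K)
numerical = lognormal_call_by_quadrature(mu, s, K)
```

The first function takes $\mathbb{E}[X]$ rather than $\mu$ as its argument, mirroring the preference expressed above.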

The Black-Scholes formula for vanilla call options now follows easily: Given (one-dimensional) risk-neutral asset dynamics

$$dS_t = S_t(r\,dt + \sigma\,dW_t)$$

with constant riskless rate $r$, we obtain discounted dynamics $d\bar S_t = \bar S_t\sigma\,dW_t$, so that $\bar S_T = S_0e^{-\frac12\sigma^2T + \sigma W_T}$. Hence $\bar S_T$ is lognormal with $\ln\frac{\bar S_T}{S_0} \sim N(-\frac12\sigma^2T, \sigma^2T)$ under the risk-neutral measure $Q$. Also, since the discounted asset price process $\bar S_t$ is a martingale under the measure $Q$, we have $\mathbb{E}_Q[\bar S_T] = S_0$.

Now let $C$ be a European call option with strike $K$ and expiry $T$ on the underlying asset $S$. By the above lognormal formula, and the risk-neutral valuation formula, the $t = 0$-price of the call is given by

$$C_0 = \mathbb{E}_Q[e^{-rT}(S_T - K)^+] = \mathbb{E}_Q[(\bar S_T - Ke^{-rT})^+] = S_0N(d_+) - Ke^{-rT}N(d_-)$$

where

$$d_\pm = \frac{\ln\frac{S_0}{Ke^{-rT}} \pm \frac12\sigma^2T}{\sigma\sqrt{T}} = \frac{\ln\frac{S_0}{K} + (r \pm \frac12\sigma^2)T}{\sigma\sqrt{T}}$$

The fact that $(S_T - K) = (S_T - K)^+ - (K - S_T)^+$, which underlies so-called put-call parity, allows us to easily calculate the price for a European put option $P$:

$$P_0 = e^{-rT}\mathbb{E}_Q[(K - S_T)^+] = e^{-rT}\mathbb{E}_Q[(S_T - K)^+ - (S_T - K)] = C_0 - S_0 + Ke^{-rT}$$

Rearrangement and symmetry of the standard normal distribution now yield

$$P_0 = Ke^{-rT}N(-d_-) - S_0N(-d_+)$$
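The call and put formulas above translate directly into code. The sketch below is illustrative only: the function name `black_scholes` and the sample inputs are assumptions. It also evaluates the put-call parity relation $C_0 - P_0 = S_0 - Ke^{-rT}$ as a consistency check.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes(S0, K, r, sigma, T):
    """Return the time-0 prices (call, put) of European vanilla options."""
    vol = sigma * math.sqrt(T)
    d_plus = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / vol
    d_minus = d_plus - vol
    disc_K = K * math.exp(-r * T)                       # discounted strike
    call = S0 * norm_cdf(d_plus) - disc_K * norm_cdf(d_minus)
    put = disc_K * norm_cdf(-d_minus) - S0 * norm_cdf(-d_plus)
    return call, put

# Hypothetical inputs.
S0, K, r, sigma, T = 100.0, 95.0, 0.05, 0.2, 1.0
call, put = black_scholes(S0, K, r, sigma, T)
parity_gap = call - put - (S0 - K * math.exp(-r * T))   # should vanish up to rounding
```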



7.2.5 Caveat: Arbitrage in the Black-Scholes Model

Throughout, we have made certain simplifying assumptions to keep the technical machinery 
to a minimum. For example, we have assumed that local martingales are martingales, which 
may not be the case. We give here a pathological example of arbitrage in an "arbitrage-free" 
model. It is related to the doubling strategy in gambling: Bet 1 on the toss of a coin. If you 
lose, bet 2 on the next toss. If you lose again, bet 4 on the next toss. With probability 1 you 






will eventually win. If this is on the $(N+1)$th toss, you will win $2^N$, whereas your losses up to that time will be $1 + 2 + 4 + \dots + 2^{N-1}$. Hence your total gain is $2^N - (1 + 2 + \dots + 2^{N-1}) = 1$. Thus a single win suffices to recoup all previous losses, and you are guaranteed to win eventually.

In the strategy below, we increase the bet on the stock when the stock price goes down. 
Eventually, the stock price will move up, recouping all our previous losses. 
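The arithmetic of the doubling strategy can be traced in a few lines. This is a sketch with an illustrative function name; it also records the largest debt reached before the win, which is the quantity that an admissibility constraint would bound.

```python
def doubling_outcome(tosses):
    """Play the doubling strategy on a sequence of tosses (True = win).

    Returns (net_gain, largest_debt) at the first win, or at the end of
    the sequence if no win occurs.
    """
    bet, wealth, worst = 1, 0, 0
    for win in tosses:
        if win:
            return wealth + bet, worst
        wealth -= bet                  # lose the current stake
        worst = min(worst, wealth)     # track the deepest debt so far
        bet *= 2                       # double the next bet
    return wealth, worst

# Win on toss N + 1 after N = 10 straight losses.
gain, worst = doubling_outcome([False] * 10 + [True])
```

The gain is always 1, but the debt incurred before the win grows like $2^N$ and is unbounded over all possible runs of losses.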

Exercise 7.2.7 Arbitrage in a simple Black-Scholes Model

Consider a financial markets model $(\Omega, \mathcal{F}, P, \mathcal{F}_t, (S_t, A_t))$ with a single risky share $S$ (in addition to the riskless bank account $A$), where $r = \mu = 0$ and $\sigma = 1$, i.e. the dynamics are given by

$$dS_t = S_t\,dW_t \qquad dA_t = 0$$

where $W_t$ is a one-dimensional standard Brownian motion. Note that $P$ is a risk-neutral measure, because $S_t, A_t$ are $P$-local martingales. Define

$$I_t = \int_0^t \frac{1}{\sqrt{T-s}}\,dW_s, \qquad 0 \le t < T$$

(a) Show that $[I]_t = \ln\frac{T}{T-t}$ for $t \in [0,T)$.

(b) Let $g(s) = T(1 - e^{-s})$ for $s \in [0,\infty)$, and define $X_s = I_{g(s)}$. Use Levy's characterization to show that $(X_s)_s$ is a Brownian motion.

(c) Deduce that $\limsup_{t\uparrow T} I(t) = \infty$ a.s. and $\liminf_{t\uparrow T} I(t) = -\infty$ a.s.

(d) Now let $a > 0$, and define a stopping time $\tau_a$ by

$$\tau_a = \inf\{t : I_t = a\}$$

Explain why $0 < \tau_a < T$ a.s.

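(e) Define a portfolio $\varphi = (\varphi^0, \varphi^1)$ by

$$\varphi^1_t = \frac{1}{S_t\sqrt{T-t}}\,1_{\{t \le \tau_a\}}$$

and adjust $\varphi^0$ to ensure that $\varphi$ is self-financing with initial value $V_0(\varphi) = 0$. (Note that $\varphi^1_t$ increases if $S_t$ decreases, and also as $t \to T$.) Show that $V_T(\varphi) = a$ a.s.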

□ 

In the above exercise, we saw that there are arbitrage strategies in the Black-Scholes model. We now show that there are no admissible arbitrage strategies. A portfolio $\theta$ is said to be admissible if and only if there is a constant $C < 0$ such that $V_t(\theta) \ge C$ for all $0 \le t \le T$, i.e. the portfolio may not fall into a debt which is $> |C|$. This is a realistic assumption, as your broker or creditors will close out your position if it becomes too negative.

Exercise 7.2.8 Let $Q$ be a risk-neutral measure (so that $S_t$ is a $Q$-local martingale).

(a) Use Fatou's Lemma to show that a non-negative local martingale is a supermartingale.

(b) Suppose that $\varphi$ is a self-financing trading strategy such that $V_0(\varphi) = 0$ and $V_t(\varphi) \ge C$ for $0 \le t \le T$ and a constant $C > -\infty$. Show that $V_t(\varphi)$ is a supermartingale.

(c) Now conclude that $\varphi$ cannot be an arbitrage.

□ 






7.3 The Black-Scholes PDE 



Consider a market model that is complete and arbitrage-free, so that there are as many sources of noise as there are risky securities. We want to price a contingent claim $C$ with payoff $\Phi(S_T)$ at expiry $T$. We assume that the price $C_t$ of the contingent claim at some prior time $t$ is given by a sufficiently smooth function $C_t = F(t, S_t) : \mathbb{R}^+ \times \mathbb{R}^N \to \mathbb{R}$. For there to be a fair or rational price at all, we must assume that the market is arbitrage-free to begin with, and that the addition of the new security $C$ does not introduce any arbitrage opportunities.

Assume that the risky asset dynamics are given by the usual multidimensional SDE $dS_t = D[S_t]\mu\,dt + D[S_t]\sigma\,dW_t$ (where $S_t = (S^1_t, \dots, S^N_t)$ and $W$ is an $N$-dimensional Brownian motion), and that the MMA $A$ satisfies $dA_t = rA_t\,dt$.

Now form a portfolio V consisting of one derivative C, as well as a combination of risky 
assets and the MM A. The aim is to make the portfolio locally riskless. An arbitrage argument 
then shows that the portfolio must have the same return as the MMA, and this will allow us 
to derive a PDE. 

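Let $w_n(t)$ be the relative weight of asset $S^n$ in the portfolio $V$, $w_A$ the weight of the MMA, and $w_C$ the weight of the contingent claim, so that

$$\frac{dV}{V} = \sum_{n=1}^N w_n\frac{dS^n}{S^n} + w_A\frac{dA}{A} + w_C\frac{dC}{C} \qquad (*)$$

as you can easily verify. Using the asset dynamics above, as well as

$$\begin{aligned}
dC &= \frac{\partial C}{\partial t}\,dt + \sum_{n=1}^N \frac{\partial C}{\partial S^n}\,dS^n + \frac12\sum_{n,m}\frac{\partial^2 C}{\partial S^n\partial S^m}\,dS^n\,dS^m \\
&= \mu_C C\,dt + \sigma_C C\,dW_t
\end{aligned}$$

where

$$\mu_C = \frac1C\left[\frac{\partial C}{\partial t} + \sum_{n=1}^N \mu_nS^n\frac{\partial C}{\partial S^n} + \frac12\sum_{n,m}(\sigma\sigma^{tr})_{nm}S^nS^m\frac{\partial^2 C}{\partial S^n\partial S^m}\right], \qquad \sigma_C = \frac1C\sum_{n=1}^N S^n\frac{\partial C}{\partial S^n}\,\sigma_n$$

where $\sigma_n = (\sigma_{n1}, \dots, \sigma_{nN})$ is the $n$th row of the volatility matrix.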
Before we put this into (*), note that

$$w_A = 1 - \Big(w_C + \sum_{n=1}^N w_n\Big)$$

because weights sum to 1. Plugging this into (*), we obtain

$$\frac{dV}{V} = \left[\sum_{n=1}^N w_n(\mu_n - r) + w_C(\mu_C - r) + r\right]dt + \left[\sum_{n=1}^N w_n\sigma_n + w_C\sigma_C\right]dW_t$$

In order to make this portfolio locally riskless, we must choose the weights so that

$$\sum_{n=1}^N w_n\sigma_n + w_C\sigma_C = 0 \qquad (**)$$






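This is a system of $N$ linear equations in the $N + 1$ unknowns $w_1, \dots, w_N, w_C$. Assuming that this system can be solved, we now have

$$\frac{dV}{V} = \left[\sum_{n=1}^N w_n(\mu_n - r) + w_C(\mu_C - r) + r\right]dt$$

Now let $\beta$ be some positive constant, and let's attempt to get a riskless return of $r + \beta$ on the portfolio $V$, i.e.

$$\sum_{n=1}^N w_n(\mu_n - r) + w_C(\mu_C - r) + r = r + \beta$$

so that

$$\sum_{n=1}^N w_n(\mu_n - r) + w_C(\mu_C - r) = \beta$$

Combining this with (**), we get a linear system

$$\begin{pmatrix}\mu_1 - r & \cdots & \mu_N - r & \mu_C - r\\ \sigma_1^{tr} & \cdots & \sigma_N^{tr} & \sigma_C^{tr}\end{pmatrix}\begin{pmatrix}w_1\\ \vdots\\ w_N\\ w_C\end{pmatrix} = \begin{pmatrix}\beta\\ 0\\ \vdots\\ 0\end{pmatrix}$$

i.e.

$$H\begin{pmatrix}w_S\\ w_C\end{pmatrix} = \begin{pmatrix}\beta\\ 0\end{pmatrix}$$

Thus to get our excess riskless return of $\beta$ we need to solve a system of $N + 1$ equations in $N + 1$ unknowns. A solution is guaranteed to exist if $\det(H) \ne 0$.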

Now in an arbitrage-free market, it is impossible to obtain a riskless return above the risk-free rate. Thus if the market is arbitrage-free, the matrix $H$ must be singular, i.e. be non-invertible, i.e. have zero determinant. The same is true for $H^{tr}$, the transpose of $H$. If we consider the transpose

$$H^{tr} = \begin{pmatrix}\mu_1 - r & \sigma_1\\ \mu_2 - r & \sigma_2\\ \vdots & \vdots\\ \mu_N - r & \sigma_N\\ \mu_C - r & \sigma_C\end{pmatrix}$$

then the singularity of $H^{tr}$ implies that its columns are not linearly independent. Thus the first column of $H^{tr}$ can be expressed as a linear combination of the other columns, i.e. there exists $u = (u_1, \dots, u_N)^{tr}$ such that

$$\sigma u = \mu - r \qquad \sigma_Cu = \mu_C - r$$

where $\mu = (\mu_1, \dots, \mu_N)^{tr}$, etc. Clearly the $u_n$ are simply the market prices of risk corresponding to the sources $W^1, \dots, W^N$, i.e. $-u$ is the Girsanov kernel effecting the change of measure from real world to risk-neutral. Now if $\sigma$ is invertible, we must have

$$u = \sigma^{-1}(\mu - r) \quad\text{and thus}\quad \mu_C - r = \sigma_C\sigma^{-1}(\mu - r)$$






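But

$$\sigma_C = \frac1C\sum_{n=1}^N S^n\frac{\partial C}{\partial S^n}\,\sigma_n = \frac1C\left(S^1\frac{\partial C}{\partial S^1}, \dots, S^N\frac{\partial C}{\partial S^N}\right)\sigma$$

and so

$$\mu_C - r = \sigma_C\sigma^{-1}(\mu - r) = \frac1C\sum_{n=1}^N S^n\frac{\partial C}{\partial S^n}(\mu_n - r)$$

Now

$$\mu_C = \frac1C\left[\frac{\partial C}{\partial t} + \sum_{n=1}^N \mu_nS^n\frac{\partial C}{\partial S^n} + \frac12\sum_{n,m}(\sigma\sigma^{tr})_{nm}S^nS^m\frac{\partial^2 C}{\partial S^n\partial S^m}\right]$$

which yields the generalized Black-Scholes equation:

$$\frac{\partial C}{\partial t} + \sum_{n=1}^N rS^n\frac{\partial C}{\partial S^n} + \frac12\sum_{n,m}(\sigma\sigma^{tr})_{nm}S^nS^m\frac{\partial^2 C}{\partial S^n\partial S^m} - rC = 0$$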
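In the one-dimensional case the equation reads $\partial_tC + rS\,\partial_SC + \frac12\sigma^2S^2\,\partial_{SS}C - rC = 0$, and one can check numerically that the Black-Scholes call price satisfies it. The following is a rough sketch only: the function names, the finite-difference step `h` and the evaluation point are illustrative assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(t, S, K, r, sigma, T):
    """Black-Scholes price at time t < T of a European call with strike K."""
    tau = T - t
    vol = sigma * math.sqrt(tau)
    d_plus = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / vol
    return S * norm_cdf(d_plus) - K * math.exp(-r * tau) * norm_cdf(d_plus - vol)

def pde_residual(t, S, K, r, sigma, T, h=1e-3):
    """Left-hand side of the one-dimensional Black-Scholes PDE, by central differences."""
    C = lambda tt, ss: bs_call(tt, ss, K, r, sigma, T)
    C_t = (C(t + h, S) - C(t - h, S)) / (2.0 * h)
    C_S = (C(t, S + h) - C(t, S - h)) / (2.0 * h)
    C_SS = (C(t, S + h) - 2.0 * C(t, S) + C(t, S - h)) / (h * h)
    return C_t + r * S * C_S + 0.5 * sigma ** 2 * S ** 2 * C_SS - r * C(t, S)

# Hypothetical evaluation point; the residual should be zero up to discretisation error.
residual = pde_residual(t=0.25, S=100.0, K=95.0, r=0.05, sigma=0.2, T=1.0)
```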



7.4 Correlated Brownian Motions 

When many assets are available in the economy, it is unrealistic to assume that these are all driven by only one source of noise. It would be equally unrealistic, however, to assume that all are driven by separate, independent Brownian motions. Thus it becomes necessary to generate multiple correlated Brownian motions.

Let's first consider a simpler case, where we are trying to generate not correlated Brownian motions, just correlated normal random variables, i.e. suppose that we want to generate mean zero normal random variables $X_1, \dots, X_n$ with a specific covariance matrix $\Sigma = (\sigma_{ij})$. Here $\sigma_{ij} = \mathrm{Cov}(X_i, X_j)$.

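You can check the following simple

Fact: If $X = (X_1, \dots, X_n)^{tr}$ is a random vector with covariance matrix $\Sigma$ and if $A$ is an $n \times n$ matrix, then the random vector

$$\begin{pmatrix}Y_1\\ \vdots\\ Y_n\end{pmatrix} = A\begin{pmatrix}X_1\\ \vdots\\ X_n\end{pmatrix}$$

has covariance matrix $A\Sigma A^{tr}$.

Indeed, we may assume without loss of generality that $\mathbb{E}[X] = 0$. Then $\mathrm{Cov}(Y) = \mathbb{E}[YY^{tr}] = A\,\mathbb{E}[XX^{tr}]\,A^{tr} = A\Sigma A^{tr}$.

□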

Now recall that a matrix $C$ is said to be positive semidefinite if and only if $x^{tr}Cx \ge 0$ for all vectors $x$. Covariance matrices are necessarily symmetric positive semidefinite, since if $\Sigma = \mathrm{Cov}(X)$, then $x^{tr}\Sigma x = \mathrm{Var}(x^{tr}X)$ is the variance of the random variable $x^{tr}X$, and hence non-negative.

It is known that symmetric positive semidefinite matrices have a Cholesky decomposition, which means that for such a matrix $A$ it is possible to find a (real) lower triangular matrix $C$ such that

$$A = CC^{tr}$$






Note that if $C$ is an arbitrary matrix, then $CC^{tr}$ is necessarily symmetric (obvious), and positive semidefinite: If $x$ is a column vector, then $x^{tr}C$ is a row vector, with squared length given by $\|x^{tr}C\|^2 = (x^{tr}C)(x^{tr}C)^{tr} = x^{tr}CC^{tr}x$. Since the squared length of a vector is necessarily non-negative, $CC^{tr}$ is positive semidefinite.

Thus any matrix $A$ that can be written as $A = CC^{tr}$ is necessarily symmetric positive semidefinite. By the Cholesky decomposition, the reverse is also true. Indeed we can easily find a lower triangular $C$ which does the trick. There is no deep mathematics behind this: we merely need to solve

$$\begin{pmatrix}a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn}\end{pmatrix} = \begin{pmatrix}c_{11} & & & \\ c_{21} & c_{22} & & \\ \vdots & & \ddots & \\ c_{n1} & c_{n2} & \cdots & c_{nn}\end{pmatrix}\begin{pmatrix}c_{11} & c_{21} & \cdots & c_{n1}\\ & c_{22} & \cdots & c_{n2}\\ & & \ddots & \vdots\\ & & & c_{nn}\end{pmatrix}$$

This system is easily solved: $c_{11}^2 = a_{11}$ gives us $c_{11}$. $c_{11}c_{21} = a_{12}$ now gives us $c_{21}$, etc.

There are fast algorithms available for calculating Cholesky decompositions.

Now suppose we are able to generate independent standard normal random variables $Z_1, \dots, Z_n$. These have the identity matrix as covariance matrix. Define a random vector $X = CZ$. Then the covariance matrix of $X$ is simply $CIC^{tr} = CC^{tr} = \Sigma$. Thus to get a vector $X$ of mean zero normally distributed random variables with covariance matrix $\Sigma$, proceed as follows:

• Generate a vector $Z$ of independent standard normal random variables (of the same dimension as $X$).

• Find the Cholesky decomposition $\Sigma = CC^{tr}$ of the symmetric positive semidefinite matrix $\Sigma$.

• Put $X = CZ$
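The three steps above can be sketched with NumPy. Note one assumption that goes beyond the text: `numpy.linalg.cholesky` requires a strictly positive definite matrix, so the (hypothetical) covariance matrix below is chosen accordingly. The function name and sample size are illustrative.

```python
import numpy as np

def correlated_normals(cov, n_samples, rng):
    """Draw mean-zero normal vectors with covariance `cov` via X = C Z."""
    C = np.linalg.cholesky(cov)                          # lower triangular, cov = C @ C.T
    Z = rng.standard_normal((cov.shape[0], n_samples))   # independent N(0, 1) entries
    return C @ Z                                         # each column is one sample of X

# Hypothetical covariance matrix (symmetric, positive definite).
cov = np.array([[4.0, 1.2, 0.0],
                [1.2, 1.0, 0.3],
                [0.0, 0.3, 2.0]])
rng = np.random.default_rng(0)
X = correlated_normals(cov, 200_000, rng)
sample_cov = np.cov(X)                                   # should be close to cov
```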

Note that if $\Sigma = (\sigma_{ij})$ is a covariance matrix, then the correlation matrix is given by

$$\rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\sigma_{jj}}}$$

Clearly the correlation matrix is also symmetric.

Now to obtain correlated Brownian motions $W^1_t, \dots, W^n_t$, we can proceed in a similar way. But first: What exactly do we mean if we say two Brownian motions $W^1, W^2$ are correlated? Clearly this has meaning if we speak about changes in the processes. If $W^1, W^2$ are highly correlated, then we expect a positive change in $W^1$ to be accompanied by a positive change in $W^2$.

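Now suppose that we have independent standard Brownian motions $B^1_t, \dots, B^n_t$. Consider a matrix $\Gamma = (\gamma_{ij})$ with the property that all the rows of $\Gamma$ have unit length. Define

$$\begin{pmatrix}W^1_t\\ \vdots\\ W^n_t\end{pmatrix} = \Gamma\begin{pmatrix}B^1_t\\ \vdots\\ B^n_t\end{pmatrix}$$

so that each $W^i_t = \sum_j \gamma_{ij}B^j_t$ is a linear combination of the $B^j_t$'s. It follows that each $W^i$ is a continuous local martingale. Now

$$[W^i]_t = \sum_{j,k}\gamma_{ij}\gamma_{ik}[B^j, B^k]_t = \sum_j\gamma_{ij}^2\,t = t$$

because $[B^j, B^k]_t = \delta_{jk}t$, and $\sum_j\gamma_{ij}^2 = 1$ (being the square of the length of the $i$th row of $\Gamma$). Hence, by Levy's Characterization, each $W^i$ is a Brownian motion. Now

$$[W^i, W^j]_t = \sum_{k,l}\gamma_{ik}\gamma_{jl}[B^k, B^l]_t = \sum_k\gamma_{ik}\gamma_{jk}\,t = (\Gamma\Gamma^{tr})_{ij}\,t$$

which we may also write as

$$dW^i_t\,dW^j_t = (\Gamma\Gamma^{tr})_{ij}\,dt$$

Hence

$$\mathbb{E}[W^i_tW^j_t] = \mathbb{E}\big[[W^i, W^j]_t\big] = (\Gamma\Gamma^{tr})_{ij}\,t$$

Thus $W^i, W^j$ are correlated Brownian motions, and the correlation between $W^i_t$ and $W^j_t$ is simply $(\Gamma\Gamma^{tr})_{ij}$, independent of $t$ (because the variance of each $W^i_t$ is just $t$).

Note that if $\Sigma, \rho$ are, respectively, the covariance and correlation matrix of $(W^1_t, \dots, W^n_t)$, then $\Sigma = \rho t$. Hence $\rho$ is also symmetric positive semidefinite, and thus has a Cholesky decomposition $\rho = \Gamma\Gamma^{tr}$.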

Further note that not every symmetric positive semidefinite matrix can be the correlation matrix of some multidimensional Brownian motion: Since the correlation of a random variable with itself is 1, it is necessary that a correlation matrix has 1's down the diagonal. This, in turn, implies that the Cholesky decomposition matrix $\Gamma$ will have row vectors of unit length.

Hence, to create correlated Brownian motions with correlation matrix $\rho$, proceed as follows:

• Find the Cholesky decomposition $\rho = \Gamma\Gamma^{tr}$. $\Gamma$ will have rows of unit length.

• Define $W = \Gamma B$, where $B$ is a multidimensional standard Brownian motion (with independent component processes). $W$ will be a multidimensional Brownian motion with correlation matrix $\rho$.
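The recipe above can be sketched on a discrete time grid. This is an illustration under assumptions: the function name, the grid size and the (strictly positive definite, as `numpy.linalg.cholesky` requires) correlation matrix are hypothetical. The sum of products of increments approximates the covariation $[W^i, W^j]_T = \rho_{ij}T$.

```python
import numpy as np

def correlated_brownian_paths(rho, T, n_steps, rng):
    """Simulate W = Gamma B on a uniform grid, where rho = Gamma Gamma^tr."""
    gamma = np.linalg.cholesky(rho)                      # lower triangular factor of rho
    dt = T / n_steps
    dB = rng.standard_normal((rho.shape[0], n_steps)) * np.sqrt(dt)  # independent increments
    dW = gamma @ dB                                      # correlated increments
    start = np.zeros((rho.shape[0], 1))                  # W_0 = 0
    W = np.concatenate([start, np.cumsum(dW, axis=1)], axis=1)
    return W, gamma

# Hypothetical correlation matrix with 1's on the diagonal.
rho = np.array([[1.0, 0.6, -0.2],
                [0.6, 1.0, 0.1],
                [-0.2, 0.1, 1.0]])
W, gamma = correlated_brownian_paths(rho, T=1.0, n_steps=100_000,
                                     rng=np.random.default_rng(1))
row_lengths = np.linalg.norm(gamma, axis=1)              # rows of Gamma have unit length
dW = np.diff(W, axis=1)
realized_cov = dW @ dW.T                                 # approximates rho * T
```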

One final remark about differential notation: Since $\Gamma\Gamma^{tr} = \rho$, and since $dW^i_t\,dW^j_t = (\Gamma\Gamma^{tr})_{ij}\,dt$, we have

$$d[W^i, W^j]_t = dW^i_t\,dW^j_t = \rho_{ij}\,dt$$

Example 7.4.1 To create two correlated Brownian motions $W^1_t, W^2_t$ with correlation $\rho$ (a number, not a matrix), proceed as follows: The correlation matrix is

$$\begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}$$

Its Cholesky decomposition is found by solving

$$\begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix} = \begin{pmatrix}a & 0\\ b & c\end{pmatrix}\begin{pmatrix}a & b\\ 0 & c\end{pmatrix}$$

for $a, b, c$. (Recall that $\Gamma$ is lower triangular.) Thus $a = 1$, $b = \rho$, $c = \sqrt{1 - \rho^2}$. Gratifyingly, the rows of $\Gamma$ are seen to possess unit length.

Finally, if $B^1_t, B^2_t$ are standard independent Brownian motions, then

$$W^1_t = B^1_t$$
$$W^2_t = \rho B^1_t + \sqrt{1 - \rho^2}\,B^2_t$$

are Brownian motions with correlation $\rho$.

□ 

Example 7.4.2 Suppose we have asset dynamics

$$\begin{pmatrix}dS^1_t\\ dS^2_t\end{pmatrix} = \begin{pmatrix}0.3S^1_t\\ 0.2S^2_t\end{pmatrix}dt + \begin{pmatrix}0.1S^1_t & 0.4S^1_t\\ 0.4S^2_t & 0.3S^2_t\end{pmatrix}\begin{pmatrix}dW^1_t\\ dW^2_t\end{pmatrix}$$

where $W^1_t, W^2_t$ are independent Brownian motions. Here each asset is driven by two sources of noise. It may be convenient to rewrite the dynamics in a decoupled fashion:

$$dS^1_t = 0.3S^1_t\,dt + \sigma_1S^1_t\,d\tilde W^1_t$$
$$dS^2_t = 0.2S^2_t\,dt + \sigma_2S^2_t\,d\tilde W^2_t$$

where $\tilde W^1_t, \tilde W^2_t$ are correlated Brownian motions. This may be simpler, because each asset is now driven by only one source of noise.

The two things that we need to know are:

(i) What are the volatilities $\sigma_1, \sigma_2$?

(ii) What is the correlation $\rho$ between $\tilde W^1$ and $\tilde W^2$?

Clearly, we must have

$$\sigma_1\,d\tilde W^1_t = 0.1\,dW^1_t + 0.4\,dW^2_t$$
$$\sigma_2\,d\tilde W^2_t = 0.4\,dW^1_t + 0.3\,dW^2_t$$

Looking at the covariance processes, we must have

$$\sigma_1^2\,dt = (0.1^2 + 0.4^2)\,dt$$
$$\sigma_2^2\,dt = (0.4^2 + 0.3^2)\,dt$$
$$\sigma_1\sigma_2\rho\,dt = (0.1 \times 0.4 + 0.4 \times 0.3)\,dt$$

which are three equations in 3 unknowns, easily solved for $\sigma_1, \sigma_2, \rho$:

$$\sigma_1 = \|(0.1, 0.4)\|$$
$$\sigma_2 = \|(0.4, 0.3)\|$$
$$\rho = \frac{(0.1, 0.4)\cdot(0.4, 0.3)}{\|(0.1, 0.4)\|\,\|(0.4, 0.3)\|}$$

Note that the vectors on the right can all be read off the volatility matrix. Thus

$$\tilde W^1_t = \frac{(0.1, 0.4)\cdot(W^1_t, W^2_t)}{\|(0.1, 0.4)\|}$$
$$\tilde W^2_t = \frac{(0.4, 0.3)\cdot(W^1_t, W^2_t)}{\|(0.4, 0.3)\|}$$

It is clear that $\tilde W^1, \tilde W^2$ are continuous martingales. Moreover

$$[\tilde W^1]_t = t = [\tilde W^2]_t$$

so that $\tilde W^1, \tilde W^2$ are indeed Brownian motions (by Levy's Characterization). Furthermore,

$$d[\tilde W^1, \tilde W^2]_t = \rho\,dt$$

as expected.
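The numbers in this example are easily computed: $\sigma_1 = \sqrt{0.17} \approx 0.412$, $\sigma_2 = 0.5$ and $\rho = 0.16/(0.5\sqrt{0.17}) \approx 0.776$. A small sketch, with the illustrative helper name `decouple`, reads the volatilities and correlations off the rows of any volatility matrix:

```python
import numpy as np

def decouple(vol_matrix):
    """Return row norms (volatilities) and the matrix of row correlations."""
    vol_matrix = np.asarray(vol_matrix, dtype=float)
    sigmas = np.linalg.norm(vol_matrix, axis=1)                      # ||sigma_i||
    rho = (vol_matrix @ vol_matrix.T) / np.outer(sigmas, sigmas)     # sigma_i . sigma_j / (||sigma_i|| ||sigma_j||)
    return sigmas, rho

sigmas, rho = decouple([[0.1, 0.4],
                        [0.4, 0.3]])
```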

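□

The above example can be generalized:

Proposition 7.4.3 Given a multidimensional SDE $dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t$, i.e.

$$\begin{pmatrix}dX^1_t\\ \vdots\\ dX^n_t\end{pmatrix} = \begin{pmatrix}b^1(t, X_t)\\ \vdots\\ b^n(t, X_t)\end{pmatrix}dt + \begin{pmatrix}\sigma_{11}(t, X_t) & \cdots & \sigma_{1m}(t, X_t)\\ \vdots & & \vdots\\ \sigma_{n1}(t, X_t) & \cdots & \sigma_{nm}(t, X_t)\end{pmatrix}\begin{pmatrix}dW^1_t\\ \vdots\\ dW^m_t\end{pmatrix}$$

where $W_t = (W^1, \dots, W^m)_t$ is a standard $m$-dimensional Brownian motion. Let $\sigma_i$ be the $i$th row of the matrix $\sigma$. Define

$$d\tilde W^i_t = \frac{\sigma_i \cdot dW_t}{\|\sigma_i\|} \qquad \text{for } i = 1, \dots, n$$

Then (by Levy's Characterization) the $\tilde W^i$ are $n$ correlated Brownian motions, with correlation

$$\rho_{ij} = \frac{\sigma_i \cdot \sigma_j}{\|\sigma_i\|\,\|\sigma_j\|}$$

and we have dynamics

$$dX^i_t = b^i(t, X_t)\,dt + \|\sigma_i(t, X_t)\|\,d\tilde W^i_t \qquad \text{for } i = 1, \dots, n$$

Here each $X^i$ is driven by only one source of noise.

□

Thus the "volatility" of a one-dimensional process of the form

$$dX_t = \mu\,dt + \sigma_1\,dW^1_t + \dots + \sigma_n\,dW^n_t$$

is

$$\|\sigma\| = \|(\sigma_1, \dots, \sigma_n)\|$$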






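What happens to the Black-Scholes PDE when we have correlated Brownian motions? Recall that this is

$$\frac{\partial C}{\partial t} + \sum_n rS^n\frac{\partial C}{\partial S^n} + \frac12\sum_{n,m}(\sigma\sigma^{tr})_{nm}S^nS^m\frac{\partial^2 C}{\partial S^n\partial S^m} - rC = 0$$

Note that $(\sigma\sigma^{tr})_{nm}$ is just $\sigma_n \cdot \sigma_m$, the inner product of the $n$th and $m$th rows of $\sigma$. We have seen that

$$\sigma_n \cdot \sigma_m = \rho_{nm}\|\sigma_n\|\,\|\sigma_m\|$$

and thus we obtain

$$\frac{\partial C}{\partial t} + \sum_n rS^n\frac{\partial C}{\partial S^n} + \frac12\sum_{n,m}\rho_{nm}\|\sigma_n\|\,\|\sigma_m\|S^nS^m\frac{\partial^2 C}{\partial S^n\partial S^m} - rC = 0$$

where $\|\sigma_n\|$ is the volatility of $S^n$.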

7.5 Change of Numeraire 

7.5.1 Introduction to Change of Numeraire 

Thus far, we've mainly considered two probability measures, the "real world" measure P, and 

the equivalent martingale measure Q for the money market account numeraire. We've seen, 
however, that it is possible to introduce an EMM for different numeraires, and to use these 
for pricing. We now show that a change of numeraire is a technique which often simplifies a 
pricing problem - it is analogous to a reduction in dimension. 

Consider an interest rate derivative $X$, and let $A_t$ be the bank account. If $Q$ is the EMM for $A_t$, then $X_t/A_t$ is a $Q$-martingale (assuming, of course, that $X$ is attainable). Thus

$$X_0 = \mathbb{E}_Q\left[\frac{X(T)}{A(T)}\right] = \mathbb{E}_Q\left[e^{-\int_0^T r(s,\omega)\,ds}X(T)\right]$$

where $r$ is the short rate, so that $A_t = e^{\int_0^t r(s,\omega)\,ds}$. In order to compute this, we would have to know the joint density of $X(T), A(T)$ under $Q$, which is not observable, because only $P$-densities can be observed. The computation of the expectation would involve a double integral.

The reason you may not have noticed this problem before is that we have generally assumed that interest rates are constant, which simplifies matters considerably. If we assume that the payoff $X(T)$ and the short rate are independent under $Q$, then we would still have some simplification, namely

$$X_0 = \mathbb{E}_Q\left[e^{-\int_0^T r(s,\omega)\,ds}\right]\mathbb{E}_Q[X(T)] = p(0,T)\,\mathbb{E}_Q[X(T)]$$

where $p(t,T)$ is the time $t$-price of a zero coupon bond with face value 1 and maturity $T$, i.e. an interest rate derivative with payoff 1 at expiry, in all states of the world. The above expression is obviously much simpler:

• It only involves a single integral, and needs only the $Q$-density of $X(T)$.

• $p(0,T)$ is observable (either directly, or by bootstrapping a yield curve from observable coupon bond prices).

Generally, of course, $X(T), A(T)$ are not independent under $Q$. Even if they were independent under $P$, they would nevertheless probably not be independent under $Q$: under $Q$, the drifts of all assets are the same, namely the short rate. Thus $X_t$ has the same drift as $A_t$, implying some correlation.

7.5.2 Mechanics of Changes of Numeraire 

As usual, we work with a market model $(\Omega, \mathcal{F}, P, (\mathcal{F}_t)_t, (S^0_t, \dots, S^N_t)_t)$. Recall:

• A numeraire is a traded asset (possibly a portfolio of assets) with a strictly positive price process.

• Self-financing portfolios remain self-financing under a change of numeraire.

• Replicating portfolios remain replicating portfolios.

Now the bank account $S^0_t = A_t$ is just a special numeraire, one whose dynamics have zero volatility: $dA_t = r(t,\omega)A_t\,dt$. Let $Q$ be the EMM for $A_t$. Then each $S^n_t/A_t$ is a $Q$-martingale, i.e. $Q$ "martingalizes" the ratios $S^n_t/A_t$.

Suppose that $\tilde A(t)$ is another numeraire, with EMM $\tilde Q$, i.e. $\tilde Q$ "martingalizes" the ratios $S^n_t/\tilde A(t)$. Given that $\tilde A(t)$ is a (combination of) traded assets, we expect $\tilde A(t)/A(t)$ to be a $Q$-martingale as well. Thus $A_t/\tilde A(t)$ is a $\tilde Q$-martingale, and $\tilde A(t)/A_t$ is a $Q$-martingale.



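What does $\tilde Q$ look like? Since $Q, \tilde Q$ are both equivalent to $P$, they are equivalent to each other, and thus the Radon-Nikodym derivative

$$L(T) = \frac{d\tilde Q}{dQ}$$

exists. We don't know yet what it is, though, because we don't know $\tilde Q$. Nevertheless, it exists, so we may define the likelihood process

$$L(t) = \mathbb{E}_Q[L(T) \mid \mathcal{F}_t]$$

We have, by Bayes' Theorem,

$$\frac{X_0}{\tilde A_0} = \mathbb{E}_{\tilde Q}\left[\frac{X(T)}{\tilde A(T)}\right] = L(0)^{-1}\,\mathbb{E}_Q\left[\frac{X(T)}{\tilde A(T)}L(T)\right]$$

so that

$$X_0 = \mathbb{E}_Q\left[\frac{X(T)}{\tilde A(T)}\,\tilde A_0\,L(T)\right]$$

because $L(0) = 1$. But

$$X_0 = \mathbb{E}_Q\left[\frac{X(T)}{A(T)}\,A_0\right]$$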






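This suggests that we turn everything around and define

$$L(T) = \frac{\tilde A(T)/A(T)}{\tilde A(0)/A(0)}$$

and then define $\tilde Q$ by

$$\frac{d\tilde Q}{dQ} = L(T)$$

Then $L(t) = \frac{\tilde A(t)/A(t)}{\tilde A(0)/A(0)}$, as you can easily check.

In general, we may use for $\tilde A(t)$ absolutely any process with the property that $\tilde A_t/A_t$ is a strictly positive $Q$-martingale.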

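Theorem 7.5.1 (Martingale Measure Pricing)

Suppose that $\tilde A(t)$ is a process with the property that $\tilde A_t/A_t$ is a strictly positive $Q$-martingale. Define $\tilde Q$ by

$$\frac{d\tilde Q}{dQ} = L(T), \qquad L(t) = \frac{\tilde A_t/A_t}{\tilde A_0/A_0}$$

If $X_t/A_t$ is a $Q$-martingale, then $X_t/\tilde A_t$ is a $\tilde Q$-martingale. In particular, if $X$ is an attainable contingent claim, then

$$X_t = \tilde A_t\,\mathbb{E}_{\tilde Q}\left[\frac{X(T)}{\tilde A(T)}\,\Big|\,\mathcal{F}_t\right]$$

□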



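In fact, we can generalize even more:

Theorem 7.5.2 Suppose that $a_1(t), a_2(t)$ are numeraires, and that $Q_1, Q_2$ are their associated EMM's. Then for any random variable $X$ we have

$$a_1(t)\,\mathbb{E}_{Q_1}\left[\frac{X}{a_1(T)}\,\Big|\,\mathcal{F}_t\right] = a_2(t)\,\mathbb{E}_{Q_2}\left[\frac{X}{a_2(T)}\,\Big|\,\mathcal{F}_t\right]$$

Proof: Define the likelihood process $L_1(t) = \frac{a_1(t)/A(t)}{a_1(0)/A(0)}$, and define $L_2(t)$ similarly. Then by Bayes' Theorem

$$a_1(t)\,\mathbb{E}_{Q_1}\left[\frac{X}{a_1(T)}\,\Big|\,\mathcal{F}_t\right] = a_1(t)L_1(t)^{-1}\,\mathbb{E}_Q\left[\frac{X}{a_1(T)}L_1(T)\,\Big|\,\mathcal{F}_t\right] = A(t)\,\mathbb{E}_Q\left[\frac{X}{A(T)}\,\Big|\,\mathcal{F}_t\right]$$

and the same goes for $a_2$.

□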






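Let's investigate how asset price dynamics change when we move from $Q$-world to $\tilde Q$-world, where $\tilde Q$ is the EMM of a new numeraire $\tilde A$. Assuming that asset prices are Ito diffusions, we have $Q$-dynamics

$$dS_t = D[S_t]r_t\,dt + D[S_t]\sigma_t\,dW_t$$

where $W_t$ is a ($K$-dimensional) $Q$-Brownian motion. The Radon-Nikodym derivative (process) which effects the change from $Q$ to $\tilde Q$ is

$$L(t) = \frac{\tilde A(t)/A(t)}{\tilde A(0)/A(0)}$$

Using the fact that $\tilde A_t/A_t$ is a $Q$-martingale, we see that

$$d\tilde A_t = r_t\tilde A_t\,dt + \tilde\sigma_t\tilde A_t\,dW_t$$

By Ito's formula,

$$dL_t = \frac{A_0}{\tilde A_0}\left[\frac{1}{A_t}\big(r_t\tilde A_t\,dt + \tilde\sigma_t\tilde A_t\,dW_t\big) - \frac{\tilde A_t}{A_t^2}\big(r_tA_t\,dt\big)\right]$$

Thus

$$dL_t = L_t\tilde\sigma_t\,dW_t$$

confirming that $L(t)$ is a $Q$-martingale, as we already knew. Solving the SDE, we obtain

$$L(t) = \exp\left(\int_0^t\tilde\sigma_s\,dW_s - \frac12\int_0^t\|\tilde\sigma_s\|^2\,ds\right)$$

and thus

Suppose we change the numeraire from the MMA $A_t$ to $\tilde A_t$. Then the EMM $\tilde Q$ associated with $\tilde A_t$ is obtained from the EMM $Q$ associated with $A_t$ by a Girsanov transformation whose kernel is the volatility $\tilde\sigma$ of the new numeraire $\tilde A_t$.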



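By Girsanov's Theorem

$$\tilde W_t = W_t - \int_0^t\tilde\sigma_s\,ds$$

is a ($K$-dimensional) $\tilde Q$-Brownian motion, where $\tilde\sigma$ is the volatility of the new numeraire $\tilde A$, and thus the asset dynamics under $\tilde Q$ are given by

$$dS_t = D[S_t]\begin{pmatrix}r_t + \sigma^1_t\cdot\tilde\sigma_t\\ \vdots\\ r_t + \sigma^N_t\cdot\tilde\sigma_t\end{pmatrix}dt + D[S_t]\sigma_t\,d\tilde W_t$$

i.e.

$$dS^n_t = (r_t + \sigma^n_t\cdot\tilde\sigma_t)S^n_t\,dt + \sigma^n_tS^n_t\,d\tilde W_t$$

where $\sigma^n$ is the $n$th row of the volatility matrix $\sigma$. In particular, the "discounted" asset ratios $\tilde S^n_t = S^n_t/\tilde A_t$ have dynamics

$$d\tilde S^n_t = \tilde S^n_t(\sigma^n_t - \tilde\sigma_t)\,d\tilde W_t$$

as you can easily verify by applying Ito's formula. Hence the $\tilde S^n$ are $\tilde Q$-martingales, and the volatility of each "discounted" asset is reduced by the volatility of the numeraire.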






Remarks 7.5.3 Consider a simple Black-Scholes model, where the risky asset prices are given by geometric Brownian motions, driven by a single source of noise. The market price of risk of $S^n$ in $\tilde Q$-world is

$$\frac{r + \sigma^n\tilde\sigma - r}{\sigma^n} = \tilde\sigma$$

i.e. all assets have the same market price of risk, namely the volatility of the numeraire. This is also true in the multidimensional case, where the market price of risk is a vector. The bank account has zero volatility, and thus the MPR in $Q$-world is zero.

□



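7.5.3 A General Option Pricing Formula

Consider a call $C$ on a security $S$ with strike $K$ and maturity $T$. Let $Q_S$ be the EMM associated with numeraire $S$, and let $Q^T$ be the $T$-forward measure (i.e. the EMM associated with the zero coupon bond $p(t,T)$ maturing at $T$).

Theorem 7.5.4

$$C_0 = S_0\,Q_S(S_T > K) - K\,p(0,T)\,Q^T(S_T > K)$$

Proof: We have

$$\begin{aligned}
C_0 &= p(0,T)\,\mathbb{E}_{Q^T}[\max\{S_T - K, 0\}] \\
&= p(0,T)\,\mathbb{E}_{Q^T}[S_T - K;\ S_T > K] \\
&= p(0,T)\,\mathbb{E}_{Q^T}[S_T;\ S_T > K] - K\,p(0,T)\,Q^T(S_T > K)
\end{aligned}$$

But we have

$$a_1(t)\,\mathbb{E}_{Q_1}\left[\frac{X}{a_1(T)}\,\Big|\,\mathcal{F}_t\right] = a_2(t)\,\mathbb{E}_{Q_2}\left[\frac{X}{a_2(T)}\,\Big|\,\mathcal{F}_t\right]$$

for general numeraires and their associated EMM's. Using this with $a_1(t) = p(t,T)$ and $a_2(t) = S_t$, we obtain

$$\begin{aligned}
p(0,T)\,\mathbb{E}_{Q^T}[S_T;\ S_T > K] &= p(0,T)\,\mathbb{E}_{Q^T}\left[\frac{S_TI_{\{S_T > K\}}}{p(T,T)}\right] \\
&= S_0\,\mathbb{E}_{Q_S}\left[\frac{S_TI_{\{S_T > K\}}}{S_T}\right] \\
&= S_0\,Q_S(S_T > K)
\end{aligned}$$

□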
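To see Theorem 7.5.4 at work, take the simple Black-Scholes model with constant rate $r$ and volatility $\sigma$. The sketch below rests on two assumptions spelled out here: with deterministic rates the $T$-forward measure coincides with $Q$, so $S$ has drift $r$ under $Q^T$; and under $Q_S$ the Girsanov kernel is the volatility of the numeraire $S$, so $S$ has drift $r + \sigma^2$. Function names and inputs are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_exceeds(S0, K, drift, sigma, T):
    """P(S_T > K) when S is a geometric Brownian motion with the given drift."""
    mean_log = math.log(S0) + (drift - 0.5 * sigma ** 2) * T
    return norm_cdf((mean_log - math.log(K)) / (sigma * math.sqrt(T)))

def call_via_measures(S0, K, r, sigma, T):
    """C_0 = S_0 Q_S(S_T > K) - K p(0,T) Q^T(S_T > K) with constant r."""
    q_S = prob_exceeds(S0, K, r + sigma ** 2, sigma, T)   # stock-numeraire measure
    q_T = prob_exceeds(S0, K, r, sigma, T)                # T-forward measure (= Q here)
    bond = math.exp(-r * T)                               # p(0, T)
    return S0 * q_S - K * bond * q_T

price = call_via_measures(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0)
```

The two probabilities are exactly $N(d_+)$ and $N(d_-)$, so the Black-Scholes formula is recovered.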



7.5.4 Applications

Example 7.5.5 Forward Measures

Consider again the situation at the beginning of this chapter: We consider a contingent claim $X$ with expiry $T$. Under risk-neutral valuation, its value is

$$X_0 = \mathbb{E}_Q\left[e^{-\int_0^T r(s,\omega)\,ds}X(T)\right]$$

where $r$ is the short rate. We bemoaned the fact that this would necessitate us knowing the joint density of $A_T$ and $X_T$. If only, we said, $A_T$ and $X_T$ were independent, we would get the much simpler

$$X_0 = \mathbb{E}_Q\left[e^{-\int_0^T r(s,\omega)\,ds}\right]\mathbb{E}_Q[X(T)] = p(0,T)\,\mathbb{E}_Q[X(T)]$$

where $p(t,T)$ is the time $t$-price of a zero coupon bond with face value 1 and maturity $T$. If only. . .

Now let's see what happens if we change the numeraire to $p(t,T)$, and let $Q^T$ be the corresponding EMM. In that case, the pricing formula becomes

$$\frac{X_0}{p(0,T)} = \mathbb{E}_{Q^T}\left[\frac{X_T}{p(T,T)}\right]$$

and noting that $p(T,T) = 1$, we have

$$X_0 = p(0,T)\,\mathbb{E}_{Q^T}[X_T]$$

This is the simple form that we sought, but it's correct under $Q^T$, and not under $Q$.

The measure $Q^T$ is called the $T$-forward measure. Note that if interest rates are deterministic, then $Q$ and $Q^T$ coincide, because then $p(t,T) = e^{-\int_t^T r_s\,ds}$, and each ratio $S^n_t/p(t,T) = A_T(S^n_t/A_t) = \text{const.} \times S^n_t/A_t$ is already a $Q$-martingale.

However, when interest rates are stochastic, $Q$ and $Q^T$ are quite different. We shall see later that futures prices are $Q$-martingales, whereas forward prices are $Q^T$-martingales. Thus forward prices and futures prices coincide if interest rates are deterministic.

□



Example 7.5.6 Exchange Options

Consider an exchange option which gives the right, but not the obligation, to exchange asset $S^1$ for asset $S^2$ at time $T$. This is a contingent claim $X$ with payoff

$$X_T = \max\{S^2_T - S^1_T, 0\}$$

Using risk-neutral valuation (i.e. MMA as numeraire), its value is therefore

$$X_0 = \mathbb{E}_Q\left[\frac{\max\{S^2_T - S^1_T, 0\}}{A_T}\right]$$

To compute this, we have to know the joint distributions of $A_T, S^1_T, S^2_T$ under $Q$, yielding a triple integral.

It is computationally simpler to change the numeraire: Let $\tilde A_t = S^1_t$, and let $\tilde Q$ be the associated EMM. The contingent claim is then priced as follows:

$$\frac{X_0}{S^1_0} = \mathbb{E}_{\tilde Q}\left[\max\{\tilde S^2_T - 1, 0\}\right]$$

(where $\tilde S^2_t = S^2_t/S^1_t$). This looks like a call on $\tilde S^2$ with strike $K = 1$, and we only have to know the distribution of $\tilde S^2_T$ under $\tilde Q$.






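To price this option, we have to assume some form of asset dynamics. Suppose these are given by one-dimensional Ito diffusions, i.e. suppose we have $P$-dynamics

$$dS^1_t = \mu_1S^1_t\,dt + \sigma_1S^1_t\,dW^1_P(t)$$
$$dS^2_t = \mu_2S^2_t\,dt + \sigma_2S^2_t\,dW^2_P(t)$$

where $W^1_P, W^2_P$ are correlated $P$-Brownian motions, with correlation $\rho_t$. To get lognormality we also assume that $\sigma_1(t), \sigma_2(t)$ and $\rho(t)$ are deterministic. Interest rates, however, can be stochastic.

We can write this as

$$d\begin{pmatrix}S^1_t\\ S^2_t\end{pmatrix} = \begin{pmatrix}\mu_1S^1_t\\ \mu_2S^2_t\end{pmatrix}dt + D[S_t]\sigma_t\,d\hat W_P(t)$$

where $\hat W^1_P, \hat W^2_P$ are independent $P$-Brownian motions and

$$\sigma = \begin{pmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{pmatrix}$$

Then we must have

$$\sigma_1 = \sqrt{\sigma_{11}^2 + \sigma_{12}^2}, \qquad \sigma_2 = \sqrt{\sigma_{21}^2 + \sigma_{22}^2} \qquad\text{and}\qquad \rho = \frac{\sigma_{11}\sigma_{21} + \sigma_{12}\sigma_{22}}{\sigma_1\sigma_2}$$

So given $\sigma_1, \sigma_2$ and $\rho$ we can solve for the matrix $\sigma$ (though not uniquely).

Note that the correlation is a function of the volatility matrix. When we change from $P$-world to $Q$-world, the volatility matrix is unchanged, and thus also the correlation. Thus under $Q$, the asset dynamics are

$$dS^1_t = r_tS^1_t\,dt + \sigma_1S^1_t\,dW^1_Q(t)$$
$$dS^2_t = r_tS^2_t\,dt + \sigma_2S^2_t\,dW^2_Q(t)$$

where $W^1_Q, W^2_Q$ are $Q$-Brownian motions, with correlation $\rho$. This can also be written as

$$d\begin{pmatrix}S^1_t\\ S^2_t\end{pmatrix} = D[S_t]r_t\,dt + D[S_t]\sigma_t\,d\hat W_Q(t)$$

where $\hat W^1_Q, \hat W^2_Q$ are independent $Q$-Brownian motions. Now when we change the numeraire from the MMA to $S^1$, and the measure from $Q$ to $\tilde Q$, we get, for $\tilde S^2_t = S^2_t/S^1_t$,

$$\begin{aligned}
d\tilde S^2_t &= \tilde S^2_t(\sigma_{21} - \sigma_{11}, \sigma_{22} - \sigma_{12})\cdot d\hat W_{\tilde Q}(t) \\
&= \tilde S^2_t\tilde\sigma_2\,dW_{\tilde Q}(t)
\end{aligned}$$

where $W_{\tilde Q}$ is a one-dimensional $\tilde Q$-Brownian motion and

$$\tilde\sigma_2 = \sqrt{(\sigma_{21} - \sigma_{11})^2 + (\sigma_{22} - \sigma_{12})^2} = \sqrt{(\sigma_1)^2 + (\sigma_2)^2 - 2\rho\sigma_1\sigma_2}$$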






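Now since the $\sigma_1(t), \sigma_2(t)$ and $\rho(t)$ are assumed to be deterministic, so is $\tilde\sigma_2(t)$. It follows that $\tilde S^2_T = S^2_T/S^1_T$ is lognormal under $\tilde Q$: Indeed

$$\tilde S^2_T = \tilde S^2_0\exp\left(-\frac12\int_0^T\tilde\sigma_2(t)^2\,dt + \int_0^T\tilde\sigma_2(t)\,dW_{\tilde Q}(t)\right)$$

so that

$$\ln(\tilde S^2_T/\tilde S^2_0) \sim N\left(-\frac12\int_0^T\tilde\sigma_2(t)^2\,dt,\ \int_0^T\tilde\sigma_2(t)^2\,dt\right)$$

Using the properties of lognormality, we see that

$$\mathbb{E}_{\tilde Q}[\max\{\tilde S^2_T - 1, 0\}] = \mathbb{E}_{\tilde Q}[\tilde S^2_T]\,N(d_1) - 1\cdot N(d_2)$$

where

$$d_1 = \frac{\ln\big(\frac{S^2_0}{S^1_0}\big) + \frac12\int_0^T\tilde\sigma_2(t)^2\,dt}{\sqrt{\int_0^T\tilde\sigma_2(t)^2\,dt}}, \qquad d_2 = d_1 - \sqrt{\int_0^T\tilde\sigma_2(t)^2\,dt}$$

and thus, using the fact that $\mathbb{E}_{\tilde Q}[\tilde S^2_T] = \tilde S^2_0 = S^2_0/S^1_0$, we have

$$X_0 = S^2_0N(d_1) - S^1_0N(d_2)$$

If we further assume that $\sigma_1, \sigma_2$ and $\rho$ are constants, we obtain

$$X_0 = S^2_0N(d_1) - S^1_0N(d_2)$$

where

$$d_1 = \frac{\ln\big(\frac{S^2_0}{S^1_0}\big) + \frac12(\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)T}{\sqrt{(\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)T}}, \qquad d_2 = d_1 - \sqrt{(\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)T}$$

where we used the fact that

$$\tilde\sigma_2 = \sqrt{(\sigma_1)^2 + (\sigma_2)^2 - 2\rho\sigma_1\sigma_2}$$

□

Note that, in the above example, $\frac1T\int_0^T\tilde\sigma_2(t)^2\,dt$ is just the average of the squared volatility of $\tilde S^2$, so that $\int_0^T\tilde\sigma_2(t)^2\,dt = \sigma^2_{\text{average}}T$.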
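For constant $\sigma_1, \sigma_2, \rho$ the exchange option formula is straightforward to evaluate. The sketch below uses an illustrative function name and hypothetical inputs. Two consistency checks are natural: with $\sigma_1 = 0$ the formula reduces to a Black-Scholes call with zero interest rate and strike $S^1_0$, and swapping the roles of the two assets changes the value by exactly $S^2_0 - S^1_0$.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def exchange_option(S1_0, S2_0, sigma1, sigma2, rho, T):
    """Time-0 value of the payoff max(S2_T - S1_T, 0) with constant parameters."""
    total_var = (sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2) * T
    vol = math.sqrt(total_var)                 # volatility of S2/S1 over [0, T]
    d1 = (math.log(S2_0 / S1_0) + 0.5 * total_var) / vol
    d2 = d1 - vol
    return S2_0 * norm_cdf(d1) - S1_0 * norm_cdf(d2)

# Hypothetical inputs.
value_21 = exchange_option(S1_0=90.0, S2_0=100.0, sigma1=0.25, sigma2=0.2, rho=0.5, T=1.0)
value_12 = exchange_option(S1_0=100.0, S2_0=90.0, sigma1=0.2, sigma2=0.25, rho=0.5, T=1.0)
```

Note that no interest rate appears among the arguments, in line with the remark that rates may be stochastic.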






Chapter 8 

Interest Rate Modelling 



8.1 Modelling Fixed Income: Introduction 
8.1.1 Classification of Interest Rate Models 

We will examine several approaches for the modelling of interest rates: short rate modelling, 
whole yield curve modelling and market models. The purpose of this section is to introduce 
basic concepts and notation. Amongst the immediately obvious quantities that we may model 
are 

• bond prices 

• the short rate 

• forward rates (discretely or continuously compounded) 

• the entire yield curve 

Of course, a model of bond prices will have the yield curve as an output, etc. These approaches are not independent.

Short rate models: These model just one variable, the short rate, which is an idealized quantity that represents the instantaneous interest rate at any time. The model is usually a diffusion model, and thus Markov. We specify dynamics, e.g.

$$dr_t = \kappa(\theta - r_t)\,dt + \sigma\,dW_t \qquad \kappa, \theta, \sigma \text{ const.}$$

is the Vasicek model, and

$$dr_t = \kappa(\theta(t) - r_t)\,dt + \sigma(t)\,dW_t \qquad \kappa \text{ const.},\ \theta, \sigma \text{ deterministic}$$

is the Hull-White (extended Vasicek) model.

• Can be one-factor or multi-factor 

• Affine term structure models have a particularly simple form, allowing for closed form 
solutions for bond option prices, Eurodollar futures, etc. More later. . . 






• Multi-factor models: principal component analysis shows that 80-90% of the variance 
of the term structure is explained by parallel shifts of the yield curve, 5-10% by a twist 
(long term and short term rates move in opposite directions, pivoting about a point), 
and 1-2% by a butterfly (long and short term rates move in the same direction, with 
mid-term rates moving in the opposite direction). 

Whole yield curve models: These model the entire term structure of rates, e.g. the entire
forward rate curve. Examples are

• Heath-Jarrow-Morton models 

• Market models 

Interest rate models are often categorized into Equilibrium models and No-arbitrage models.
Equilibrium models attempt to derive, e.g., short rate dynamics from macroeconomic
considerations, starting from a representative investor (e.g. Cox-Ingersoll-Ross, Vasicek,
Merton models). These models often have the nice property of being time-homogeneous,
but usually are unable to fit observed prices exactly. No-arbitrage models attempt to fit a
model exactly to observed prices and volatilities of zero coupon bonds, caplets and swaptions
(e.g. Ho-Lee, Hull-White models).

Both terms are misnomers: Some equilibrium models are not arbitrage-free, and thus 
not in equilibrium. Some no-arbitrage models permit negative interest rates, thus allowing 
"mattress arbitrage" (borrow from the bank when rates go negative, put under mattress). 

8.1.2 Bond Market Basics 

One of the basic instruments that we shall be concerned with is the following: 

Definition 8.1.1 A T-bond is a zero coupon bond with face value 1.00 and maturity T. Its
value at time $t \le T$ is denoted by $p(t, T)$.

□

These are also called discount bonds. No default risk.

We work in a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ equipped with a filtration which satisfies the
usual conditions. We usually require that:

• $p(t, T)$ is a continuous semimartingale for each T.

• $0 \le p(t, T) \le 1$ a.s. (This fails in, e.g., Gaussian short rate models.)

• There is a frictionless market for T-bonds of every maturity $T > 0$.

• For every fixed $T > 0$, $\{p(t, T) : 0 \le t \le T\}$ is an optional process with $p(T, T) = 1$.

• For every fixed t, $p(t, T)$ is ($\mathbb{P}$-a.s.) differentiable in the second variable T. We write

$$p_T(t,T) = \frac{\partial p(t,T)}{\partial T}$$






Note that, for fixed t, the set $\{p(t, T) : T \ge t\}$ is just the term structure of zero coupon
bond prices, which is typically a smooth decreasing function (of T). On the other hand, for
fixed T, the set $\{p(t, T) : t \le T\}$ is the price process of the security $p(t, T)$, which is typically
very ragged (i.e. of unbounded variation).

Note that there are, in our model, infinitely many securities, namely one p{t, T) for each 
maturity T. 

We briefly recall the definitions of the various types of rate:

• Let $t \le S < T$. Consider the following strategy:

(i) At time t, short an S-bond, and use the proceeds to buy $\frac{p(t,S)}{p(t,T)}$-many T-bonds.
Net cashflow at time t is zero.

(ii) At time S, pay \$1.00 to redeem the S-bond.

(iii) At time T, receive $\frac{p(t,S)}{p(t,T)}$ from maturing T-bonds.

Thus at time t, we can, with no initial cash outlay, ensure that a deposit of \$1.00 at
time S leads to a payoff of $\frac{p(t,S)}{p(t,T)}$ at time T. This implies that we can lock in an interest
rate $R(t;S,T)$ for the future period $[S,T]$:

$$e^{R(t;S,T)(T-S)} = \frac{p(t,S)}{p(t,T)}, \qquad \text{i.e.} \qquad R(t;S,T) = -\frac{\ln p(t,T) - \ln p(t,S)}{T-S}$$

This is the forward rate (continuously compounded) for the period $[S, T]$ at time t.

• The equivalent simple forward rate (the LIBOR forward rate) for $[S, T]$ contracted at
time t is similarly defined by

$$1 + L(t;S,T)(T-S) = \frac{p(t,S)}{p(t,T)}, \qquad \text{i.e.} \qquad L(t;S,T) = -\frac{p(t,T) - p(t,S)}{p(t,T)(T-S)}$$



• The continuously compounded and simple spot rates at time t for time T are $R(t; t, T)$ and $L(t; t, T)$
respectively.

• The instantaneous forward rate at time t for time T is the interest rate that can be
locked in at time t for an infinitesimal interval $[T, T + dT]$. It is given by

$$f(t,T) = \lim_{\Delta T \to 0} R(t; T, T + \Delta T) = -\lim_{\Delta T \to 0} \frac{\ln p(t, T + \Delta T) - \ln p(t,T)}{\Delta T} = -\frac{\partial \ln p(t,T)}{\partial T}$$

• The short rate is the instantaneous spot rate, and is defined by

$$r(t) = f(t,t)$$






• Given a tenor structure

$$0 \le t \le T_0 < T_1 < T_2 < \cdots < T_N$$

we can find a forward swap rate $S_t = S(t; T_0, T_1, \ldots, T_N)$, the unique fixed rate, at time
t, for which a fixed-for-floating forward swap, starting at time $T_0$, will have zero value.
We clearly require, with $\tau_j = T_j - T_{j-1}$, that

$$\sum_{j=1}^N S_t\,\tau_j\,p(t, T_j) = \sum_{j=1}^N L(t; T_{j-1}, T_j)\,\tau_j\,p(t, T_j)$$

and thus

$$S_t = \frac{\sum_{j=1}^N L(t; T_{j-1}, T_j)\,\tau_j\,p(t, T_j)}{\sum_{j=1}^N \tau_j\,p(t, T_j)}$$

But

$$\sum_{j=1}^N L(t; T_{j-1}, T_j)\,\tau_j\,p(t, T_j) = \sum_{j=1}^N -\left[p(t, T_j) - p(t, T_{j-1})\right] = p(t, T_0) - p(t, T_N)$$

and hence

$$S_t = \frac{p(t, T_0) - p(t, T_N)}{\sum_{j=1}^N \tau_j\,p(t, T_j)}$$

The denominator $\sum_{j=1}^N \tau_j\,p(t, T_j)$ is sometimes referred to as the value of a basis point.
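The rate definitions above all reduce to arithmetic on discount bond prices. The sketch below is illustrative only: the function names and the flat 5% sample curve are assumptions, not part of the notes. It computes the continuously compounded and simple forward rates and the forward swap rate, the latter both directly and as an annuity-weighted average of forward LIBORs.

```python
from math import exp, log


def cc_forward_rate(p_S, p_T, S, T):
    """Continuously compounded forward rate R(t;S,T) from p(t,S), p(t,T)."""
    return -(log(p_T) - log(p_S)) / (T - S)


def simple_forward_rate(p_S, p_T, S, T):
    """Simple (LIBOR) forward rate L(t;S,T): 1 + L (T - S) = p(t,S)/p(t,T)."""
    return (p_S - p_T) / (p_T * (T - S))


def annuity(discounts, tenors):
    """Value of a basis point: sum over j = 1..N of tau_j p(t,T_j)."""
    return sum((tenors[j] - tenors[j - 1]) * discounts[j]
               for j in range(1, len(tenors)))


def forward_swap_rate(discounts, tenors):
    """S_t = (p(t,T_0) - p(t,T_N)) / sum_j tau_j p(t,T_j)."""
    return (discounts[0] - discounts[-1]) / annuity(discounts, tenors)


def swap_rate_from_libor(discounts, tenors):
    """The same rate, as an annuity-weighted average of forward LIBORs."""
    floating_leg = sum(
        simple_forward_rate(discounts[j - 1], discounts[j],
                            tenors[j - 1], tenors[j])
        * (tenors[j] - tenors[j - 1]) * discounts[j]
        for j in range(1, len(tenors)))
    return floating_leg / annuity(discounts, tenors)


# Hypothetical flat curve p(0,T) = exp(-0.05 T) on tenor dates T_0..T_4
tenors = [1.0, 1.5, 2.0, 2.5, 3.0]
discounts = [exp(-0.05 * T) for T in tenors]
R = cc_forward_rate(discounts[0], discounts[1], tenors[0], tenors[1])
L = simple_forward_rate(discounts[0], discounts[1], tenors[0], tenors[1])
S_direct = forward_swap_rate(discounts, tenors)
S_libor = swap_rate_from_libor(discounts, tenors)
```

On a flat curve every forward LIBOR is the same, so the swap rate coincides with it; the two ways of computing the swap rate agree because the floating leg telescopes.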

Remarks 8.1.2 1. The assumption that there are traded zero coupon bonds of every matu- 
rity is clearly false. Nevertheless, a large number of implied zero coupon bond prices can 
usually be obtained by bootstrapping the yield curve. 

2. The instantaneous rates (forward- and short-) are theoretical entities, and not directly 
observable in the market. One of the shortcomings of short rate and HJM models is that 
they model these non-existent entities. Market models such as the BGM- and Jamshidian 
models, however, are concerned with the modelling of quoted market rates. 



□ 



The following lemma shows how bond prices are related to forward rates:
Lemma 8.1.3

$$p(t,T) = p(t,S)\,e^{-\int_S^T f(t,u)\,du}$$

Proof: $\ln p(t,T) = \ln p(t,S) + \int_S^T \frac{\partial \ln p(t,u)}{\partial u}\,du$.

□

As usual, we denote the money market account (MMA) process $A_t$ by

$$A_t = e^{\int_0^t r(u)\,du}$$

where $r(t)$ is the short rate.






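Example 8.1.4 No model that allows only parallel shifts of the yield curve is arbitrage-free.
Proof: Suppose it is certain that $f(1,T) = f(0,T) + \varepsilon$ for all $T \ge 1$, where $\varepsilon$ is a random
variable. Now choose times $1 < T_1 < T_2 < T_3$. At $t = 1$,

$$p(1,T_i) = e^{-\int_1^{T_i} f(1,u)\,du} = e^{-\int_1^{T_i} f(0,u)\,du}\,e^{-\varepsilon(T_i - 1)} = \frac{p(0,T_i)}{p(0,1)}\,e^{-\varepsilon(T_i - 1)}$$

Now suppose that we hold $x_i$ $T_i$-bonds ($i = 1,2,3$). We construct an arbitrage, a static
portfolio satisfying

(i) $\sum_{i=1}^3 x_i\,p(0,T_i) = 0$

(ii) $\sum_{i=1}^3 x_i\,p(1,T_i) > 0$ a.s.

At time 1 the value of the portfolio is

$$V_1(\varepsilon) = \sum_{i=1}^3 x_i\,p(1,T_i) = \sum_{i=1}^3 x_i\,\frac{p(0,T_i)}{p(0,1)}\,e^{-\varepsilon(T_i - 1)} = \frac{e^{-\varepsilon(T_2 - 1)}}{p(0,1)}\,g(\varepsilon)$$

where

$$g(\varepsilon) = \sum_{i=1}^3 x_i\,p(0,T_i)\,e^{\varepsilon(T_2 - T_i)}$$

We shall ensure that $V_1(\varepsilon) > 0$ whenever $\varepsilon \ne 0$. First note that $g(0) = 0$, because
$\sum_{i=1}^3 x_i\,p(0,T_i) = 0$. Further, $V_1(\varepsilon)$ and $g(\varepsilon)$ always have the same sign, so to ensure $V_1(\varepsilon) > 0$,
it suffices to ensure that $g(\varepsilon) > 0$.

Now g is a $C^2$-function (twice differentiable), and we require that (i) $g(0) = 0$, (ii) $g(\varepsilon) > 0$
whenever $\varepsilon \ne 0$. It follows that $g'(0) = 0$, thus that

$$g'(0) = \sum_{i=1}^3 x_i (T_2 - T_i)\,p(0,T_i) = 0$$

and thus (using $\sum_{i=1}^3 x_i\,p(0,T_i) = 0$) that

$$\sum_{i=1}^3 x_i\,T_i\,p(0,T_i) = 0$$

Next, if we ensure $g''(\varepsilon) > 0$, then, combined with $g(0) = g'(0) = 0$, we see that $g(\varepsilon) > 0$
for all $\varepsilon \ne 0$. Now

$$g''(\varepsilon) = \sum_{i=1}^3 x_i (T_2 - T_i)^2\,p(0,T_i)\,e^{\varepsilon(T_2 - T_i)}$$

and thus $g''(\varepsilon) > 0$ for all $\varepsilon$ if $x_1, x_3 \ge 0$ (and at least one is $> 0$).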






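Now take $x_2 < 0$. Since $\sum_{i=1}^3 x_i\,p(0,T_i) = 0$, we see that at least one of $x_1, x_3$ must be
$> 0$. Since $\sum_{i=1}^3 x_i (T_2 - T_i)\,p(0,T_i) = 0$, and $T_2 - T_1 > 0 > T_2 - T_3$, we see that $x_1, x_3$ have the same sign, i.e.
both are $> 0$. Then $g''(\varepsilon) > 0$ for all $\varepsilon$, and hence also $g(\varepsilon) > 0$ for all $\varepsilon \ne 0$.

It follows that any portfolio $(x_1, x_2, x_3)$ satisfying

(i) $\sum_{i=1}^3 x_i\,p(0,T_i) = 0$

(ii) $\sum_{i=1}^3 x_i\,T_i\,p(0,T_i) = 0$

(iii) $x_2 < 0$

is an arbitrage.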

□ 
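The arbitrage portfolio of Example 8.1.4 can be written down explicitly and checked numerically. The following is a sketch under stated assumptions: the maturities, the flat 5% initial curve and the function names are hypothetical choices made only for illustration. With $x_2 = -1$, the two linear conditions (i) and (ii) determine $x_1$ and $x_3$.

```python
from math import exp


def barbell_weights(p0, maturities):
    """Solve sum x_i p_i = 0 and sum x_i T_i p_i = 0 with x_2 = -1."""
    (T1, T2, T3), (p1, p2, p3) = maturities, p0
    x1 = p2 * (T3 - T2) / (p1 * (T3 - T1))
    x3 = p2 * (T2 - T1) / (p3 * (T3 - T1))
    return (x1, -1.0, x3)


def value_at_time_1(x, p0, p0_1, maturities, eps):
    """V_1(eps) = sum_i x_i p(0,T_i)/p(0,1) exp(-eps (T_i - 1))."""
    return sum(xi * pi / p0_1 * exp(-eps * (Ti - 1.0))
               for xi, pi, Ti in zip(x, p0, maturities))


# Hypothetical initial curve, flat at 5% (any positive prices would do)
maturities = (2.0, 5.0, 10.0)
p0 = tuple(exp(-0.05 * T) for T in maturities)
p0_1 = exp(-0.05 * 1.0)

x = barbell_weights(p0, maturities)
initial_cost = sum(xi * pi for xi, pi in zip(x, p0))
values = {eps: value_at_time_1(x, p0, p0_1, maturities, eps)
          for eps in (-0.02, -0.01, 0.0, 0.01, 0.02)}
```

The portfolio costs nothing at time 0, is worth nothing if the curve does not move, and has strictly positive value for any non-zero parallel shift, in either direction.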

Example 8.1.5 Define the long rate $l(t)$ by

$$l(t) = \lim_{T \to \infty} R(t,T)$$

where $R(t,T)$ is the c.c. spot rate, i.e. $p(t,T) = e^{-R(t,T)(T - t)}$. Though $l(t)$ is not directly
obtainable from traded securities (because the longest-term securities typically have a life
of 30 years or so), it can be estimated, and empirical studies suggest that it fluctuates
considerably over time. Most no-arbitrage models have a constant value for $l(t)$, however, and
indeed

Theorem: If the term-structure dynamics are arbitrage-free, then $l(t)$ is an increasing
function a.s.

Proof: By rescaling time, we may assume that $l(1) < l(0)$ with positive probability, to
obtain a contradiction. For $T = 1,2,3,\ldots$, construct a portfolio which, at $t = 0$, invests
$\frac1T - \frac{1}{T+1} = \frac{1}{T(T+1)}$ into each of the bonds $p(t, T)$, so that the value of the portfolio is
$V_0 = \sum_{T=1}^\infty \frac{1}{T(T+1)} = 1$. Define $\varepsilon = (l(0) - l(1))/3$. Now $p(0,T) = e^{-R(0,T)T}$ and $R(0,T) \to l(0)$
as $T \to \infty$, so eventually we have $R(0,T) > l(0) - \varepsilon$, i.e. $p(0,T) < e^{-(l(0) - \varepsilon)T}$ eventually.
Similarly, $p(1,T) > e^{-(l(1) + \varepsilon)T}$ eventually. Suppose these relations hold for all $T \ge T_0$. Then

$$V_1 = \sum_{T=1}^\infty \frac{p(1,T)}{T(T+1)\,p(0,T)} \ge \sum_{T < T_0} \frac{p(1,T)}{T(T+1)\,p(0,T)} + \sum_{T \ge T_0} \frac{e^{\varepsilon T}}{T(T+1)}$$

On the event $\{l(1) < l(0)\}$ we have $\varepsilon > 0$, so the second term diverges to $\infty$, so that $V_1 = \infty$. Now since $V_0 = E_{\mathbb{Q}}[V_1/B_1]$, where $\mathbb{Q}$ is
a risk-neutral measure and B is the bank account, we see that $\mathbb{Q}(V_1 = \infty) = 0$, because
$V_0 = 1 < \infty$. Since the "real-world" measure $\mathbb{P}$ is equivalent to $\mathbb{Q}$, we must have
$\mathbb{P}(l(1) < l(0)) \le \mathbb{P}(V_1 = \infty) = 0$ as well.



□ 



8.1.3 Modelling the Bond Market 

We consider three approaches: 



1. Specify short rate dynamics;

2. Specify bond price dynamics;

3. Specify forward rate dynamics.

Suppose, for example, that we are given the following dynamics:

1. Short rate dynamics:

$$dr(t, \omega) = a(t, \omega)\,dt + b(t, \omega)\,dW_t$$

2. Bond price dynamics:

$$dp(t, T)(\omega) = p(t, T)(\omega)\left[m(t, T, \omega)\,dt + v(t, T, \omega)\,dW_t\right]$$

3. Forward rate dynamics:

$$df(t,T)(\omega) = \alpha(t, T, \omega)\,dt + \sigma(t, T, \omega)\,dW_t$$

Here $W_t$ is a standard (multidimensional) Brownian motion.

If we're given one type of dynamics, can we deduce the others? If you think about this for
a while, you'd expect that bond prices and short rates are deducible from the forward rates,
and that forward rates and the short rate are deducible from the bond prices. A model
of just the short rate, however, seems to contain too little information to deduce all bond prices and
forward rates.

Before we write down exactly how the various dynamics are related to each other, we need 
a stochastic Fubini Theorem and its corollary. 

Proposition 8.1.6 (Fubini's Theorem for Stochastic Integrals) 



where $(s, \omega, x) \mapsto \Phi(s, x, \omega)$ is $\mathcal{P} \times \mathcal{B}$-measurable ($\mathcal{P}$ = predictable $\sigma$-algebra, $\mathcal{B}$ = Borel
algebra), and



The proof is omitted, but may be found in Durrett, Chapter 2, Section 11. 

Before we prove a corollary about the differentiation of stochastic integrals, it is convenient 
to gather well-known results about the differentiation of ordinary Lebesgue integrals: 

Proposition 8.1.7 Assuming sufficient smoothness and regularity,

$$\frac{d}{dx}\int_{g(x)}^{h(x)} f(x,y)\,dy = \int_{g(x)}^{h(x)} \frac{\partial}{\partial x} f(x,y)\,dy + f(x,h(x))\,\frac{dh}{dx} - f(x,g(x))\,\frac{dg}{dx}$$

□





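Corollary 8.1.8 (Differentiation under the integral sign)

$$\frac{\partial}{\partial T}\int_0^t v(s,T)\,dW_s = \int_0^t \frac{\partial v(s,T)}{\partial T}\,dW_s$$

Proof: Just like the ordinary proof of differentiation under the integral sign. Write
$v(s,T) = v(s,0) + \int_0^T v_T(s,u)\,du$, where $v_T$ denotes the partial derivative in the second variable;
the term $\int_0^t v(s,0)\,dW_s$ does not depend on T, so

$$\begin{aligned}
\frac{\partial}{\partial T}\int_0^t v(s,T)\,dW_s &= \frac{\partial}{\partial T}\int_0^t\int_0^T v_T(s,u)\,du\,dW_s \\
&= \frac{\partial}{\partial T}\int_0^T\int_0^t v_T(s,u)\,dW_s\,du \\
&= \int_0^t \frac{\partial v(s,T)}{\partial T}\,dW_s
\end{aligned}$$

where the second equality uses the stochastic Fubini theorem.

□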



Consider now the various dynamics given above, i.e. short rate, bond price and forward
rate dynamics. Assume that the drifts and variance rates are continuously differentiable in the T-variable, and
sufficiently regular to allow the interchange of order of integration. Further, assume that
bond prices are bounded.

The following theorem records the relationships between the various dynamics: 



Theorem 8.1.9 (a) If

$$\frac{dp}{p} = m\,dt + v\,dW_t$$

then

$$df = \alpha\,dt + \sigma\,dW_t$$

where

$$\alpha(t, T) = v_T(t, T) \cdot v(t, T) - m_T(t, T), \qquad \sigma(t,T) = -v_T(t,T)$$

(b) If

$$df = \alpha\,dt + \sigma\,dW_t$$

then

$$dr = a\,dt + b\,dW_t$$

where

$$a(t) = f_T(t,t) + \alpha(t,t), \qquad b(t) = \sigma(t,t)$$

(c) If

$$df = \alpha\,dt + \sigma\,dW_t$$

then

$$\frac{dp}{p} = \left[r(t) + A(t,T) + \frac12\|S(t,T)\|^2\right] dt + S(t, T)\,dW_t$$

where

$$A(t,T) = -\int_t^T \alpha(t, s)\,ds, \qquad S(t,T) = -\int_t^T \sigma(t,s)\,ds$$

Here $\|\cdot\|$ is just the usual Euclidean norm.



Before we begin the proof, note that for each T we have a separate security $p(t,T)$, i.e.
for each T we have a separate process $(p(t, T))_{t \ge 0}$. It is to these processes that we apply Ito's
formula, etc.



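Proof: (1) $d\ln p = [m - \frac12 v^2]\,dt + v\,dW_t$ and thus

$$\ln p(t,T) = \ln p(0,T) + \int_0^t \left[m(s,T) - \frac12 v^2(s,T)\right] ds + \int_0^t v(s,T)\,dW_s$$

so that

$$-f(t,T) = \frac{\partial \ln p(t,T)}{\partial T} = \frac{\partial \ln p(0,T)}{\partial T} + \int_0^t \left[m_T - v_T \cdot v\right] ds + \int_0^t v_T\,dW_s$$

Taking differentials yields the result.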
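(2)

$$r(t) = f(t,t) = f(0,t) + \int_0^t \alpha(s,t)\,ds + \int_0^t \sigma(s,t)\,dW_s$$

where

$$\alpha(s,t) = \alpha(s,s) + \int_s^t \alpha_T(s,u)\,du, \qquad \sigma(s,t) = \sigma(s,s) + \int_s^t \sigma_T(s,u)\,du$$

and thus

$$\begin{aligned}
r(t) &= f(0,t) + \int_0^t \alpha(s,s)\,ds + \int_0^t\int_s^t \alpha_T(s,u)\,du\,ds + \int_0^t \sigma(s,s)\,dW_s + \int_0^t\int_s^t \sigma_T(s,u)\,du\,dW_s \\
&= f(0,t) + \int_0^t \alpha(s,s)\,ds + \int_0^t\int_0^u \alpha_T(s,u)\,ds\,du + \int_0^t \sigma(s,s)\,dW_s + \int_0^t\int_0^u \sigma_T(s,u)\,dW_s\,du
\end{aligned}$$

by the stochastic Fubini theorem. Thus

$$\begin{aligned}
dr(t) &= \left[f_T(0,t) + \alpha(t,t) + \int_0^t \alpha_T(s,t)\,ds + \int_0^t \sigma_T(s,t)\,dW_s\right] dt + \sigma(t,t)\,dW_t \\
&= \left[\alpha(t, t) + f_T(t, t)\right] dt + \sigma(t, t)\,dW_t
\end{aligned}$$

as required, since $f_T(t,t) = f_T(0,t) + \int_0^t \alpha_T(s,t)\,ds + \int_0^t \sigma_T(s,t)\,dW_s$.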

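(3) First define $Y(t,T) = -\int_t^T f(t,s)\,ds$, so that $p(t,T) = e^{Y(t,T)}$. Now

$$f(t,s) = f(0,s) + \int_0^t \alpha(u,s)\,du + \int_0^t \sigma(u,s)\,dW_u$$

and hence

$$\begin{aligned}
Y(t, T) &= -\int_t^T f(0, s)\,ds - \int_t^T\int_0^t \alpha(u, s)\,du\,ds - \int_t^T\int_0^t \sigma(u, s)\,dW_u\,ds \\
&= -\int_t^T f(0, s)\,ds - \int_0^t\int_t^T \alpha(u, s)\,ds\,du - \int_0^t\int_t^T \sigma(u, s)\,ds\,dW_u \\
&= -\int_0^T f(0,s)\,ds + \int_0^t f(0,s)\,ds \\
&\quad - \int_0^t\int_u^T \alpha(u, s)\,ds\,du + \int_0^t\int_u^t \alpha(u, s)\,ds\,du \\
&\quad - \int_0^t\int_u^T \sigma(u, s)\,ds\,dW_u + \int_0^t\int_u^t \sigma(u,s)\,ds\,dW_u \\
&= Y(0, T) - \int_0^t\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\int_u^T \sigma(u, s)\,ds\,dW_u \\
&\quad + \int_0^t f(0, s)\,ds + \int_0^t\int_u^t \alpha(u, s)\,ds\,du + \int_0^t\int_u^t \sigma(u, s)\,ds\,dW_u \\
&= Y(0,T) - \int_0^t\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\int_u^T \sigma(u, s)\,ds\,dW_u + \int_0^t f(s, s)\,ds
\end{aligned}$$

Hence

$$Y(t,T) = Y(0,T) + \int_0^t r(s)\,ds - \int_0^t\int_u^T \alpha(u,s)\,ds\,du - \int_0^t\int_u^T \sigma(u,s)\,ds\,dW_u$$

so that

$$dY(t, T) = \left[r(t) - \int_t^T \alpha(t, s)\,ds\right] dt - \left[\int_t^T \sigma(t, s)\,ds\right] dW_t = \left[r(t) + A(t, T)\right] dt + S(t, T)\,dW_t$$

and thus

$$dp = d(e^Y) = e^Y\left[dY + \frac12\,d[Y]\right]$$

implies

$$\frac{dp(t, T)}{p(t,T)} = \left[r(t) + A(t,T) + \frac12\|S(t,T)\|^2\right] dt + S(t,T)\,dW_t$$

as required.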



□ 
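As a quick sanity check of part (c), consider a one-dimensional example in which the forward rate volatility is a constant $\sigma$ and the forward rate drift is $\alpha(t,s) = \sigma^2(s - t)$. These coefficients are assumed here purely for illustration; they are not taken from the text above.

```latex
\begin{aligned}
A(t,T) &= -\int_t^T \sigma^2 (s - t)\,ds = -\tfrac12 \sigma^2 (T-t)^2, \\
S(t,T) &= -\int_t^T \sigma\,ds = -\sigma (T - t), \\
\tfrac12 \|S(t,T)\|^2 &= \tfrac12 \sigma^2 (T-t)^2, \\
\frac{dp(t,T)}{p(t,T)} &= \left[r(t) + A(t,T) + \tfrac12\|S(t,T)\|^2\right] dt + S(t,T)\,dW_t
 = r(t)\,dt - \sigma (T - t)\,dW_t .
\end{aligned}
```

For this particular choice the terms $A$ and $\frac12\|S\|^2$ cancel, so every bond has drift equal to the short rate, and the bond volatility $\sigma(T - t)$ shrinks to zero as maturity approaches.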



Example 8.1.10 Synthetic Money Market Account

In a bond market, subject to the conditions enumerated before, it is possible to synthetically
create a locally risk-free bank account. This is accomplished by rolling over just maturing
bonds.

Consider a portfolio V which, at any time, consists solely of bonds maturing at time $t + dt$.
Suppose that there are $n_t$ such bonds in the portfolio, so that

$$V_t = n_t\,p(t, t + dt)$$

By the self-financing condition,

$$dV_t = n_t\,dp(t, t + dt) = n_t\,p(t, t + dt)\left\{\left[r(t) + A(t,t + dt) + \frac12\|S(t,t + dt)\|^2\right] dt + S(t, t + dt)\,dW_t\right\}$$

Now as $dt \to 0$, also $A(t, t+dt) = -\int_t^{t+dt} \alpha(t, s)\,ds \to 0$, and $S(t, t+dt) = -\int_t^{t+dt} \sigma(t, s)\,ds \to 0$.
Thus in the limit,

$$dV_t = r(t)\,V_t\,dt$$

which are just the dynamics of the MMA.

(Note, however, that the above argument is heuristic in nature: It requires, in any time 
interval, however short, the use of infinitely many types of securities.) 

□ 

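In the risk-neutral world, discounted bond price processes are martingales, and thus

$$p(t, T) = E_{\mathbb{Q}}\left[e^{-\int_t^T r(s)\,ds}\,p(T, T) \,\Big|\, \mathcal{F}_t\right]$$

In a Brownian world, any equivalent measure is obtained from the objective measure by a
Girsanov transformation (a consequence of the Martingale Representation Theorem). If

$$\frac{d\mathbb{Q}}{d\mathbb{P}} = \exp\left(\int_0^T u_t\,dW_t - \frac12\int_0^T \|u_t\|^2\,dt\right)$$

and if $L(t) = E_{\mathbb{P}}\left[\frac{d\mathbb{Q}}{d\mathbb{P}} \,\big|\, \mathcal{F}_t\right]$ is the associated likelihood process, then $dL_t = u_t L_t\,dW_t$. Now if
$\tilde p(t,T) = \frac{p(t,T)}{A_t}$, then (under $\mathbb{Q}$)

$$d\tilde p(t,T) = \tilde p(t,T)\,v(t,T)\,d\tilde W_t$$

(where $\tilde W_t$ is a $\mathbb{Q}$-Brownian motion), so that

$$dp(t, T) = r(t)\,p(t, T)\,dt + p(t, T)\,v(t, T)\,d\tilde W_t$$

Hence under $\mathbb{P}$, we have dynamics

$$\frac{dp(t,T)}{p(t,T)} = \left[r(t) - u(t)\,v(t, T)\right] dt + v(t, T)\,dW_t$$

i.e. in a Brownian world bond price dynamics are necessarily of the form $dp = p\,m\,dt + p\,v\,dW_t$.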






8.2 Modelling the Short Rate 

Short rate models are bond market models where the only explanatory variable is the short
rate r. This was the earliest approach to bond market models, dating back to the paper by
Vasicek (1977), but short rate models have limited power. Nevertheless, principal component
analysis shows that typically 80-90% of price variation in the bond market can be explained
by a single factor, so these models are not wholly devoid of realism.

When we specify only the short rate, the only exogenously given asset is the MMA $A_t$.
Zero coupon bonds will be regarded not as primitive securities, but as derivatives of the short
rate.

Question: Are bond prices uniquely determined by the $\mathbb{P}$-dynamics of the short rate?

We assume that we live in a Brownian world governed by an objective probability measure
$\mathbb{P}$, with change driven by a (multidimensional) Brownian motion $W_t$. We further assume short
rate dynamics of the form

$$dr(t) = \mu(t, r)\,dt + \sigma(t, r)\,dW_t$$

i.e. the short rate is an Ito diffusion.

The answer to the above question is No!

• The above bond market is clearly incomplete:

– We are able to execute trading strategies which consist of putting all our money
in the bank account only. This clearly doesn't give us enough freedom to replicate
all possible $\mathcal{F}_T$-measurable random variables.

– There is at least one source of randomness, but there are no risky assets.

– Under any measure, the discounted MMA $A_t/A_t \equiv 1$ is a martingale, hence any measure
equivalent to $\mathbb{P}$, including $\mathbb{P}$ itself, is an EMM. The EMM is not unique.

• If $\mathbb{Q} \sim \mathbb{P}$ is any equivalent measure, then $\mathbb{Q}$ generates an arbitrage-free bond market
with prices

$$p(t,T) = E_{\mathbb{Q}}\left[e^{-\int_t^T r(s)\,ds} \,\Big|\, \mathcal{F}_t\right]$$

In the Black-Scholes model, we also had one source of uncertainty, but there option
prices are determined by the dynamics of an underlying which is traded. The crucial
difference here is that the underlying is the short rate, which is not a traded security.

Nevertheless, bonds of different maturities must satisfy certain internal consistency
conditions in order to exclude arbitrage. For example, if $T_1 < T_2$, then $p(t, T_1) > p(t, T_2)$, or else
there will be arbitrage (assuming positive rates).

If we have d sources of noise (i.e. $W_t$ is a d-dimensional Brownian motion), then we may
pick d maturities, and regard the bonds of those maturities as "primitive" securities; bonds of
all other maturities will be "derivative". Our market now has as many risky primitive assets
as sources of noise, and is therefore complete.

8.2.1 The Term Structure PDE 

Assume that we have an arbitrage-free bond market, with $\mathbb{P}$-short rate dynamics given by

$$dr(t) = \mu(t, r)\,dt + \sigma(t, r)\,dW_t$$

where $W_t$ is a one-dimensional $\mathbb{P}$-Brownian motion. We restrict to one dimension purely for
ease of exposition; similar results hold in the multidimensional case.

Also assume that the price of a T-bond at time t is given by a sufficiently smooth and
regular function F:

$$p(t,T) = F(t, r(t); T) = F^T(t,r)$$

By taking two bonds of different maturities S and T, we are able to create a locally riskless
portfolio. Arbitrage considerations then dictate that the drift of this portfolio is equal to the
short rate. As usual, this yields a PDE, as we now show.
First note that by Ito's formula

$$dF^T = \left[F^T_t + \mu F^T_r + \frac12\sigma^2 F^T_{rr}\right] dt + \sigma F^T_r\,dW_t$$

(where subscripts denote partial derivatives), so that

$$\frac{dF^T}{F^T} = \alpha^T\,dt + \sigma^T\,dW_t$$

where

$$\alpha^T(t,r) = \frac{F^T_t + \mu F^T_r + \frac12\sigma^2 F^T_{rr}}{F^T}, \qquad \sigma^T(t,r) = \frac{\sigma F^T_r}{F^T}$$

Consider now a portfolio V consisting of S- and T-bonds with relative weights $w^S$, $w^T$
respectively. Then

$$\frac{dV}{V} = w^S\,\frac{dF^S}{F^S} + w^T\,\frac{dF^T}{F^T}$$

To eliminate risk, set $w^S\sigma^S + w^T\sigma^T = 0$. Since weights add up to 1, we therefore obtain

$$w^S = \frac{\sigma^T}{\sigma^T - \sigma^S}, \qquad w^T = -\frac{\sigma^S}{\sigma^T - \sigma^S}$$

Then

$$\frac{dV}{V} = \frac{\alpha^S\sigma^T - \alpha^T\sigma^S}{\sigma^T - \sigma^S}\,dt$$

i.e.

$$\frac{\alpha^S\sigma^T - \alpha^T\sigma^S}{\sigma^T - \sigma^S} = r$$

i.e.

$$\frac{\alpha^S(t,r) - r}{\sigma^S(t,r)} = \frac{\alpha^T(t,r) - r}{\sigma^T(t,r)}$$

Now $\alpha^T$ is just the drift of the bond price $p(t,T) = F^T(t,r)$, and $\sigma^T$ is its volatility. Thus

$$\frac{\alpha^T(t, r) - r}{\sigma^T(t,r)} = \text{Market Price of Risk} = \lambda$$

i.e. all bonds have the same market price of risk $\lambda = \lambda(t,r)$. $\lambda$ is independent of maturity
(though it may vary over time).






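Proposition 8.2.1 (Term Structure PDE)

In an arbitrage-free one-factor short rate model $dr = \mu\,dt + \sigma\,dW_t$ there is a process $\lambda(t, r)$
such that

$$\frac{\alpha^T(t, r) - r}{\sigma^T(t,r)} = \text{Market Price of Risk} = \lambda$$

Hence all bonds satisfy the following PDE

$$F^T_t + (\mu - \lambda\sigma)F^T_r + \frac12\sigma^2 F^T_{rr} - rF^T = 0, \qquad F^T(T,r) = 1$$

Proof: Since

$$\alpha^T(t,r) = \frac{F^T_t + \mu F^T_r + \frac12\sigma^2 F^T_{rr}}{F^T}, \qquad \sigma^T(t,r) = \frac{\sigma F^T_r}{F^T}$$

we have

$$\frac{F^T_t + \mu F^T_r + \frac12\sigma^2 F^T_{rr} - rF^T}{\sigma F^T_r} = \lambda$$

which can easily be manipulated to yield the term structure PDE.

□

Using the Feynman-Kac formula, we see that the bond prices are given by

$$F^T(t,r) = F(t,r;T) = E^{\mathbb{Q}^\lambda}_{t,r}\left[e^{-\int_t^T r(s)\,ds}\right]$$

i.e.

$$p(t,T) = E^{\mathbb{Q}^\lambda}_{t,r(t)}\left[e^{-\int_t^T r(s)\,ds}\right]$$

where

$$dr = (\mu - \sigma\lambda)\,ds + \sigma\,dW_s \quad (s \ge t), \qquad r(t) = r$$

are the dynamics of r under $\mathbb{Q}^\lambda$. Note that, since $r_t$ is a Markov process, we have

$$E_{t,r(t)}\left[e^{-\int_t^T r(s)\,ds}\right] = E\left[e^{-\int_t^T r(s)\,ds} \,\Big|\, \mathcal{F}_t\right]$$

From the fact that

$$p(t,T) = E^{\mathbb{Q}^\lambda}\left[e^{-\int_t^T r(s)\,ds} \,\Big|\, \mathcal{F}_t\right]$$

i.e. that discounted bond prices are $\mathbb{Q}^\lambda$-martingales, it follows that each $\mathbb{Q}^\lambda$ is a risk-neutral measure (i.e. an EMM for the MMA).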

We can also get the risk-neutral short rate dynamics from Girsanov's Theorem: a Girsanov
transformation which effects the change of measure from real-world to risk-neutral has a
Girsanov kernel equal to the negative of the market price of risk. Thus $-\sigma\lambda$ is added to
the drift when we change the measure. Each market price of risk process $\lambda$ gives a different
risk-neutral measure $\mathbb{Q}^\lambda$.

To summarize: 

• In an arbitrage-free short rate model, all bonds have the same market price of risk,
regardless of maturity.

• Different market prices of risk yield different risk-neutral measures: the bond market
is not complete.

• The agents in the market will (implicitly) determine $\lambda$ and thus $\mathbb{Q}^\lambda$.
8.2.2 Martingale Models of the Short Rate 

We model the short rate directly under a fixed risk-neutral measure $\mathbb{Q}$. This is the EMM
chosen by market participants, and should, in principle, be hidden in the term structure of
bond prices. By calibrating a short rate model to bond prices, the market price of risk, and
thus the market EMM, can be determined. This procedure is known as inverting the yield
curve, and works as follows:

(1) Choose a short rate model (Ho-Lee, Vasicek, Cox-Ingersoll-Ross, Black-Derman-Toy)
involving one or more parameters $\alpha = (\alpha_1, \ldots, \alpha_n)$. The $\mathbb{Q}$-dynamics of the short rate
are given by

$$dr(t) = \mu(t, r(t); \alpha)\,dt + \sigma(t, r(t); \alpha)\,dW_t$$

(2) Solve the term structure PDE. In the risk-neutral world, the market price of risk is $\lambda = 0$,
and thus the PDE is

$$F^T_t + \mu F^T_r + \frac12\sigma^2 F^T_{rr} - rF^T = 0, \qquad F^T(T,r) = 1$$

for all maturities T. This yields theoretical bond prices

$$p(t,T;\alpha) = F^T(t, r_t; \alpha)$$

(3) Go to the market, and "observe" the empirical term structure of bond prices $\{p^*(0,T) : T \ge 0\}$.

(4) Choose $\alpha$ so that the theoretical prices $p(0,T;\alpha)$ fit the empirical prices $p^*(0,T)$ "as
closely as possible" (where "close" must be defined somehow. For example, one method
would be to pick maturities $T_1, \ldots, T_n$ and to pick $\alpha_1, \ldots, \alpha_n$ so that

$$\sum_{k=1}^n \left(p(0,T_k;\alpha) - p^*(0,T_k)\right)^2$$

is minimized.) Let $\alpha^*$ be this "best" parameter.

(5) We now have dynamics

$$dr(t) = \mu(t, r(t); \alpha^*)\,dt + \sigma(t, r(t); \alpha^*)\,dW_t$$

under the risk-neutral measure. We can also, in principle, observe the real-world dynamics

$$dr(t) = \bar\mu\,dt + \sigma\,d\bar W_t$$

Since $\mu = \bar\mu - \sigma\lambda$, we now know the market price of risk $\lambda$, and thus $\mathbb{Q} = \mathbb{Q}^\lambda$.






(6) Ideally, we would like to have

$$p(0,T;\alpha^*) = p^*(0,T) \quad \text{for all } T$$

However, these are infinitely many equations (one for each T), in only finitely many unknowns
(the $\alpha_1, \ldots, \alpha_n$). This system is over-determined, and the model cannot in general be made to fit
the initial term structure of bond prices exactly.

(7) However, if we choose $\alpha$ to be an infinite dimensional vector, rather than a finite
dimensional one, there may be sufficient room to fit the term structure exactly. For example,
the Ho-Lee model is given by

$$dr(t) = \theta(t)\,dt + \sigma\,dW_t$$

where $\sigma$ is a constant, and $W_t$ a one-dimensional Brownian motion. Here $\alpha = (\theta(t) : t \ge 0)$
is an infinite-dimensional vector. The Ho-Lee model can be fitted to the empirically
observed term structure, but this is not obvious a priori.

(8) Once we've parametrized our model, we can price other interest rate derivatives.
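Step (4) of the procedure above can be sketched in a few lines. This is an illustration only, under several assumptions that are not in the text: it uses the closed-form Vasicek bond price for $dr = (b - ar)\,dt + \sigma\,dW_t$, generates synthetic "market" prices from the model itself, and replaces a proper optimizer by a crude grid search over $b$ with $a$ and $\sigma$ held fixed. All names and numbers are hypothetical.

```python
from math import exp


def vasicek_bond_price(r0, T, a, b, sigma):
    """p(0,T) = exp(A - B r0) for risk-neutral dr = (b - a r) dt + sigma dW."""
    B = (1.0 - exp(-a * T)) / a
    A = (B - T) * (a * b - 0.5 * sigma**2) / a**2 - sigma**2 * B**2 / (4.0 * a)
    return exp(A - B * r0)


def sum_squared_errors(params, r0, maturities, market_prices):
    """Objective of step (4): sum_k (p(0,T_k;alpha) - p*(0,T_k))^2."""
    a, b, sigma = params
    return sum((vasicek_bond_price(r0, T, a, b, sigma) - p_star) ** 2
               for T, p_star in zip(maturities, market_prices))


def calibrate_b(r0, maturities, market_prices, a, sigma, b_grid):
    """Crude grid search over b, with a and sigma held fixed."""
    return min(b_grid, key=lambda b: sum_squared_errors(
        (a, b, sigma), r0, maturities, market_prices))


# Synthetic "market" prices generated from known (hypothetical) parameters
r0 = 0.03
maturities = [1.0, 2.0, 5.0, 10.0]
true_params = (0.5, 0.02, 0.01)  # (a, b, sigma)
market = [vasicek_bond_price(r0, T, *true_params) for T in maturities]

b_grid = [0.010 + 0.001 * k for k in range(21)]
b_hat = calibrate_b(r0, maturities, market, a=0.5, sigma=0.01, b_grid=b_grid)
```

Because the synthetic prices come from the model, the fit is exact here. With real market prices and only three parameters, the minimized sum of squares would in general stay strictly positive, which is the over-determination described in step (6).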
8.2.3 Common Short Rate Models 

The following are common short rate models with just one source of noise:

• Vasicek:

$$dr = (b - ar)\,dt + \sigma\,dW_t$$

where $a, b, \sigma$ are constants.

• Cox-Ingersoll-Ross:

$$dr = (a - br)\,dt + \sigma\sqrt{r}\,dW_t$$

where $a, b, \sigma$ are constants.

• Dothan or Rendleman-Bartter:

$$dr = ar\,dt + \sigma r\,dW_t$$

where $a, \sigma$ are constants.

• Merton:

$$dr = a\,dt + \sigma\,dW_t$$

where $a, \sigma$ are constants.

• Ho-Lee:

$$dr = \theta(t)\,dt + \sigma\,dW_t$$

where $\sigma$ is a constant.

• Hull-White (extended Vasicek):

$$dr = (b(t) - a(t)r)\,dt + \sigma(t)\,dW_t$$

• Hull-White (extended CIR):

$$dr = (b(t) - a(t)r)\,dt + \sigma(t)\sqrt{r}\,dW_t$$

• Black-Derman-Toy:

$$dr = a(t)r\,dt + \sigma(t)r\,dW_t$$

• Black-Karasinski:

$$dr = (a(t)r + b(t)\,r\ln r)\,dt + \sigma(t)r\,dW_t$$

All of the above can be written as

$$dr = \left(\alpha_1(t) + \alpha_2(t)r + \alpha_3(t)\,r\ln r\right) dt + \left(\beta_1(t) + \beta_2(t)r\right)^\nu dW_t$$
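The general form above lends itself to a single simulation routine. The following Euler-Maruyama sketch is illustrative and makes simplifying assumptions not in the text: the coefficients are taken constant in time, the function name and sample parameters are hypothetical, and $\beta_1 + \beta_2 r$ is floored at zero before the power is taken so that square-root models do not fail on a negative discretized rate.

```python
import random
from math import log, sqrt


def simulate_short_rate(r0, T, n_steps, a1, a2, a3, b1, b2, nu, seed=0):
    """Euler-Maruyama path of
    dr = (a1 + a2 r + a3 r ln r) dt + (b1 + b2 r)^nu dW, constant coefficients."""
    rng = random.Random(seed)
    dt = T / n_steps
    r = r0
    path = [r]
    for _ in range(n_steps):
        # The r ln r term is only evaluated when used; it needs r > 0
        drift = a1 + a2 * r + (a3 * r * log(r) if a3 != 0.0 else 0.0)
        diffusion = max(b1 + b2 * r, 0.0) ** nu
        r = r + drift * dt + diffusion * sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(r)
    return path


# Vasicek dr = (b - a r) dt + sigma dW: a1 = b, a2 = -a, b1 = sigma, nu = 1
vasicek_path = simulate_short_rate(0.03, 1.0, 250, a1=0.02, a2=-0.5, a3=0.0,
                                   b1=0.01, b2=0.0, nu=1.0, seed=42)

# CIR dr = (a - b r) dt + sigma sqrt(r) dW: b2 = sigma^2, nu = 1/2
cir_path = simulate_short_rate(0.03, 1.0, 250, a1=0.02, a2=-0.5, a3=0.0,
                               b1=0.0, b2=0.01**2, nu=0.5, seed=42)

# With the noise switched off, the Vasicek rate relaxes to b/a = 0.04
deterministic_path = simulate_short_rate(0.10, 20.0, 20000, a1=0.02, a2=-0.5,
                                         a3=0.0, b1=0.0, b2=0.0, nu=1.0)
```

The plain Euler scheme is the simplest choice rather than the most accurate one; in particular it does not preserve positivity of the rate in the CIR or lognormal cases.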

8.2.4 Term Structure Derivatives 

Consider the general short rate model $dr(t) = \mu(t, r)\,dt + \sigma(t, r)\,dW_t$. Suppose that an interest
rate derivative has a terminal payoff $\Phi(T, r_T)$ and a dividend rate $q(t, r_t)$ over the interval
$[0, T]$. The time-t price of the derivative is obtained via an arbitrage argument: Start with a
portfolio V consisting of one derivative F and $-n$ T-bonds P. Because of the dividends, we
obtain

$$dV = dF - n\,dP + q\,dt$$

But choosing $n = \frac{F_r}{P_r}$ will make the portfolio locally riskless, and we obtain

$$\frac{F_t + \frac12\sigma^2 F_{rr} - rF + q}{F_r} = \frac{P_t + \frac12\sigma^2 P_{rr} - rP}{P_r}$$

Now the term structure PDE states that

$$P_t + \frac12\sigma^2 P_{rr} - rP = -(\mu - \sigma\lambda)P_r$$

and thus

$$F_t + (\mu - \sigma\lambda)F_r + \frac12\sigma^2 F_{rr} - rF + q = 0, \qquad F(T, r_T) = \Phi(T, r_T)$$

This is the generalized term structure equation for an interest rate derivative F (where $\lambda = 0$
if we model the short rate in the risk-neutral world).
The value of the interest rate derivative is clearly

$$F(t, r_t; T) = E^{\mathbb{Q}}_t\left[e^{-\int_t^T r_u\,du}\,\Phi(T, r_T) + \int_t^T e^{-\int_t^s r_u\,du}\,q(s, r_s)\,ds\right]$$

(A trivial generalization of the Feynman-Kac argument shows that the solution of a PDE of the form

$$F_t + \mu F_x + \frac12\sigma^2 F_{xx} - rF + h = 0, \qquad F(T, x) = \Phi(T, x)$$

is given by

$$F(t,x) = E_{t,x}\left[e^{-\int_t^T r\,du}\,\Phi(T, X_T) + \int_t^T e^{-\int_t^s r\,du}\,h(s, X_s)\,ds\right]$$

where

$$dX_s = \mu\,ds + \sigma\,dW_s \quad \text{for } t \le s \le T, \qquad X_t = x$$

with r evaluated along the path of X.)



Example 8.2.2 (a) A call with strike K and expiry $\tau$ on a discount bond $p(t, T)$ (with
$T > \tau$) has $q = 0$ and $\Phi(\tau, r) = (p(\tau,T) - K)^+$. To calculate the option price, we first
have to solve the term structure PDE to get the bond prices, and then once more to price
the option.

(b) An interest rate swap (pay-fixed) can be idealized as a contract paying a dividend rate $h(t, r_t) =
r_t - r^*$, where $r^*$ is the agreed-upon fixed rate (the swap rate at inception). Here $\Phi(T, r_T) = 0$,
and so

$$F(t, r_t) = E_t\left[\int_t^T e^{-\int_t^s r_u\,du}\,(r_s - r^*)\,ds\right]$$

Now a floating rate note paying a continuous rate $r_t$ must be priced at par = 1 in order
to avoid arbitrage (Why?), and thus

$$E_t\left[\int_t^T e^{-\int_t^s r_u\,du}\,r_s\,ds + e^{-\int_t^T r_u\,du}\right] = 1$$

It follows that

$$E_t\left[\int_t^T e^{-\int_t^s r_u\,du}\,r_s\,ds\right] = 1 - E_t\left[e^{-\int_t^T r_u\,du}\right] = 1 - p(t,T)$$

Hence the value of this idealized swap is

$$F(t, r_t) = 1 - p(t,T) - r^*\int_t^T p(t,s)\,ds$$

The swap rate at time t for maturity T sets the value of the swap to zero, and is

$$r^*(t,T) = \frac{1 - p(t,T)}{\int_t^T p(t,s)\,ds}$$

(c) A cap can be idealized as a derivative with zero terminal payoff and a dividend rate
$q(t, r_t) = (r_t - \bar r)^+$, where $\bar r$ is the cap rate.



□ 
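The idealized swap rate of part (b) needs only the current discount curve. A minimal numerical sketch follows, in which the trapezoid rule, the function names and the flat 4% sample curve are assumptions made for illustration.

```python
from math import exp


def discount_integral(discount, t, T, n=1000):
    """Trapezoid approximation of the integral of p(t,s) over s in [t,T]."""
    h = (T - t) / n
    total = 0.5 * (discount(t) + discount(T))
    total += sum(discount(t + k * h) for k in range(1, n))
    return total * h


def idealized_swap_rate(discount, t, T, n=1000):
    """r*(t,T) = (1 - p(t,T)) / integral of p(t,s) ds."""
    return (1.0 - discount(T)) / discount_integral(discount, t, T, n)


def idealized_swap_value(discount, t, T, fixed_rate, n=1000):
    """F = 1 - p(t,T) - r* times the integral of p(t,s) ds (pay-fixed swap)."""
    return (1.0 - discount(T)
            - fixed_rate * discount_integral(discount, t, T, n))


# Hypothetical flat curve: p(t,s) = exp(-0.04 (s - t))
t, T, flat_rate = 0.0, 5.0, 0.04


def flat_curve(s):
    return exp(-flat_rate * (s - t))


par_rate = idealized_swap_rate(flat_curve, t, T)
value_at_par = idealized_swap_value(flat_curve, t, T, par_rate)
value_below_par = idealized_swap_value(flat_curve, t, T, 0.03)
```

On a flat continuously compounded curve the idealized swap rate equals the flat rate itself, and paying a fixed rate below par gives the swap positive value.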



We will spend quite a bit of effort pricing options on discount bonds in the next few pages.
But what about coupon-bearing bonds, which are, after all, more commonly traded in the
market? Jamshidian's Trick sometimes holds the answer: In a short rate model, a call on a
coupon-bearing bond can be priced as a portfolio of calls on zero coupon bonds, provided
that the value $p(t,T) = p(t, r_t; T)$ of the zero coupon bonds is a strictly decreasing function
of the short rate.

Theorem 8.2.3 Let $C^{K,\tau}(t, r_t)$ be the time-t value of a call on a coupon bond B, where $\tau$ is
the expiry of the call, and K the strike. Suppose that the coupon bond pays a coupon $Y_i > 0$ at
date $T_i$, where $\tau < T_1 < \cdots < T_n$. Let

$$K_i = p(\tau, r^*, T_i)$$

where $r^*$ solves

$$B(\tau, r^*) = K$$

(Recall that $B(\tau, r) = \sum_i Y_i\,p(\tau, r, T_i)$. Since each $p(\tau, r, T_i)$ is a decreasing function of r, so is $B(\tau, r)$, which
implies that $r^*$ is unique. Solve numerically for $r^*$, e.g. via the bisection method.) Then

$$C^{K,\tau}(t, r) = \sum_i Y_i\,C^{K_i,\tau,T_i}(t, r)$$

where $C^{K,T,S}$ is the time-t value of a strike K, expiry T call on a zero coupon bond $p(t, S)$
(with $S > T$).

Proof: The payoff of the call on B is

$$\left(B(\tau, r_\tau) - K\right)^+ = \left(\sum_i Y_i\,p(\tau, r_\tau, T_i) - K\right)^+$$

Since each $p(\tau, r, T_i)$ is decreasing in r, so is $B(\tau, r)$. Let $r^*$ be the unique value of r for
which the call C expires at the money, i.e. for which

$$B(\tau, r^*) = K$$

Now define $K_i = p(\tau, r^*, T_i)$. Then

$$\sum_i Y_i K_i = K$$

Now consider two cases:
Case 1: If $r_\tau < r^*$, then

$$\sum_i Y_i\,p(\tau, r_\tau, T_i) > \sum_i Y_i\,p(\tau, r^*, T_i) = K$$

and

$$p(\tau, r_\tau, T_i) > p(\tau, r^*, T_i) = K_i$$

Thus if $C^{K,\tau}$ expires in the money, then so does each $C^{K_i,\tau,T_i}$, and

$$\begin{aligned}
\left(\sum_i Y_i\,p(\tau, r_\tau, T_i) - K\right)^+ &= \sum_i Y_i\,p(\tau, r_\tau, T_i) - K \\
&= \sum_i Y_i\left(p(\tau, r_\tau, T_i) - K_i\right) \\
&= \sum_i Y_i\left(p(\tau, r_\tau, T_i) - K_i\right)^+
\end{aligned}$$

Case 2: If $r_\tau \ge r^*$, then $\sum_i Y_i\,p(\tau, r_\tau, T_i) \le K$ and $p(\tau, r_\tau, T_i) \le K_i$. Thus if $C^{K,\tau}$ expires
out of the money, then so does each $C^{K_i,\tau,T_i}$, so

$$0 = \left(\sum_i Y_i\,p(\tau, r_\tau, T_i) - K\right)^+ = \sum_i Y_i\left(p(\tau, r_\tau, T_i) - K_i\right)^+$$

Hence in either case

$$C^{K,\tau}(\tau, r_\tau) = \sum_i Y_i\,C^{K_i,\tau,T_i}(\tau, r_\tau)$$

Thus, by the law of one price,

$$C^{K,\tau}(t, r_t) = \sum_i Y_i\,C^{K_i,\tau,T_i}(t, r_t)$$

for all $t \le \tau$ as well.

□
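The decomposition in the proof can be verified numerically at expiry. The sketch below is illustrative only: the bond price function `zero_price` is a hypothetical stand-in (any function strictly decreasing in r would serve), and the coupon schedule, strike and bracketing interval are assumed values. It finds $r^*$ by bisection, computes the strikes $K_i$, and compares the two payoffs.

```python
from math import exp


def zero_price(r, tau):
    """Toy zero coupon bond price with time to maturity tau.

    Hypothetical stand-in for a short rate model's bond price function;
    all that matters here is that it is strictly decreasing in r.
    """
    return exp(-r * tau)


def coupon_bond(r, expiry, dates, coupons):
    """B(tau, r) = sum_i Y_i p(tau, r, T_i)."""
    return sum(Y * zero_price(r, Ti - expiry) for Ti, Y in zip(dates, coupons))


def critical_rate(K, expiry, dates, coupons, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve B(tau, r*) = K by bisection, using that B is decreasing in r."""
    if not (coupon_bond(lo, expiry, dates, coupons) > K
            > coupon_bond(hi, expiry, dates, coupons)):
        raise ValueError("r* is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coupon_bond(mid, expiry, dates, coupons) > K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def payoff_on_coupon_bond(r, K, expiry, dates, coupons):
    """(B(tau, r) - K)^+"""
    return max(coupon_bond(r, expiry, dates, coupons) - K, 0.0)


def payoff_of_zero_bond_calls(r, strikes, expiry, dates, coupons):
    """sum_i Y_i (p(tau, r, T_i) - K_i)^+"""
    return sum(Y * max(zero_price(r, Ti - expiry) - Ki, 0.0)
               for Ti, Y, Ki in zip(dates, coupons, strikes))


# Hypothetical 5% annual coupon bond, option expiry 1, strike at par
expiry, dates, coupons, K = 1.0, [2.0, 3.0, 4.0], [0.05, 0.05, 1.05], 1.0
r_star = critical_rate(K, expiry, dates, coupons)
strikes = [zero_price(r_star, Ti - expiry) for Ti in dates]
```

The two payoff functions agree for every short rate at expiry, which is exactly the statement proved above; pricing the individual zero coupon bond calls would then require a specific short rate model.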



8.2.5 Lognormal Models 

The Dothan, Rendleman-Bartter, Black-Derman-Toy and Black-Karasinski models all yield lognormal
short rate dynamics. All suffer from the following problem: Let $A_t$ denote the money
market account, with $dA_t = r_t A_t\,dt$, $A_0 = 1$. Then

$$A_t = e^{\int_0^t r_s\,ds} \approx e^{\frac{t}{2}(r_0 + r_t)}$$

for sufficiently small t. Now, since $r_t$ is lognormal, define $Y_t = \ln r_t$. Then we have an
expectation of the form

$$E\left[e^{e^Z}\right]$$

for some normally distributed Z. Now

$$\int_{-\infty}^{\infty} e^{e^z}\,e^{-z^2/2}\,dz = \infty$$

as $e^{e^z}$ grows much faster than $e^{-z^2/2}$ decays for large values of z. Hence $E[A(t)] = \infty$ even if t is small, i.e. the
bank account, on average, explodes.
Indeed, it can be shown that

$$E\left[\frac{1}{\cdots}\right] = \infty \quad \text{for all } t > 0$$



One consequence of this lognormal explosion is that one cannot price Eurodollar futures. 



8.3 Affine Term Structure Models
8.3.1 Mechanics of ATS models

Definition 8.3.1 A short rate model is said to possess affine term structure (ATS) if bond
prices are given by

$$p(t,T) = F^T(t, r(t)) = e^{A(t,T) - B(t,T)\,r(t)}$$

where $A(t,T)$, $B(t,T)$ are (sufficiently regular) deterministic functions.



□ 






Note that not all short rate models are affine term structure models. However, the class of 
affine term structure models is quite well understood: They are those for which both the drift 
and the volatility-squared are affine functions of the short rate. 

For consider a short rate model with risk-neutral dynamics dr{t) = iJ,{t, r) dt + a{t, r) dWt 
and suppose that bond prices arc of the form p{t,T) = F'^{t,r(t)) = e'^^*'"^^"^^*'"^-'^^*). Sub- 
stituting this expression into the term structure PDE 



we obtain 



At- ijlB + l-a'^B'^ 



F^{T,r) = l 



[1 + Bt)r = 



Moreover, since p{T, T) = 1, we must have A(T, T) = = B[T, T). 

If we assume that the drift and volatility of the short rate can be expressed in the form 

li{t,r) = a{t)r + f3{t) 
a^{t,r) =-f{t)r + S{t) 



then we obtain 



At-PB + ^5B^ 



1 + Bt + aB- -^B' 



The lefthand side is independent of r, whereas the righthand side contains r. This can happen 
only if both sides are identically zero, so that we obtain a coupled system of differential 
equations: 

\[ \begin{cases} A_t(t,T) = \beta(t)B(t,T) - \tfrac12\delta(t)B^2(t,T), & A(T,T) = 0 \\[4pt] B_t(t,T) = -\alpha(t)B(t,T) + \tfrac12\gamma(t)B^2(t,T) - 1, & B(T,T) = 0 \end{cases} \]

Note that the bottom equation (a Riccati equation) does not contain $A$, and can therefore be solved on its own (in principle, although this may be quite hard). The solution can then be plugged into the top equation to solve for $A$. To solve this equation, simply integrate both sides (from $t$ to $T$).

Thus a short rate model has affine term structure whenever $\mu$, $\sigma$ are of the form $\mu(t,r) = \alpha(t)r + \beta(t)$ and $\sigma^2(t,r) = \gamma(t)r + \delta(t)$. The Ho-Lee, Cox-Ingersoll-Ross, Merton, Vasicek and Hull-White models all have ATS. The Dothan and Black-Derman-Toy models do not.
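When the Riccati equation has no convenient closed form, the coupled system for $A$ and $B$ can be integrated numerically, backwards from the terminal conditions $A(T,T) = B(T,T) = 0$. The following is only a minimal sketch: the function name `solve_ats`, the classical Runge-Kutta scheme and the default step count are choices made for this illustration and do not come from the text.

```python
import math

def solve_ats(alpha, beta, gamma, delta, t, T, steps=2000):
    """Return (A(t,T), B(t,T)) for drift alpha(s)*r + beta(s) and
    squared volatility gamma(s)*r + delta(s), using classical RK4.

    alpha, beta, gamma, delta are callables of time (hypothetical interface).
    """
    def rhs(s, A, B):
        dA = beta(s) * B - 0.5 * delta(s) * B * B                # A_t
        dB = -alpha(s) * B + 0.5 * gamma(s) * B * B - 1.0        # B_t (Riccati)
        return dA, dB

    h = (t - T) / steps        # negative step: march from s = T down to s = t
    s, A, B = T, 0.0, 0.0      # terminal conditions A(T,T) = B(T,T) = 0
    for _ in range(steps):
        k1 = rhs(s, A, B)
        k2 = rhs(s + h / 2, A + h / 2 * k1[0], B + h / 2 * k1[1])
        k3 = rhs(s + h / 2, A + h / 2 * k2[0], B + h / 2 * k2[1])
        k4 = rhs(s + h, A + h * k3[0], B + h * k3[1])
        A += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        B += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        s += h
    return A, B

def ats_bond_price(A, B, r):
    """Bond price e^{A - B r} once A(t,T) and B(t,T) are known."""
    return math.exp(A - B * r)
```

For constant coefficients $\alpha = -a$, $\beta = b$, $\gamma = 0$, $\delta = \sigma^2$ the output can be compared with the closed-form Vasicek expressions, which is a convenient sanity check on the step size.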




Example 8.3.2 The Vasicek Model 

Here we have $dr_t = (b - ar_t)\,dt + \sigma\,dW_t$, where $a, b, \sigma$ are constants. Thus we have $\alpha = -a$, $\beta = b$, $\gamma = 0$, $\delta = \sigma^2$, all constant.

The system of differential equations that must be solved is therefore

\[ \begin{cases} A_t = bB - \tfrac12\sigma^2B^2 \\ A(T,T) = 0 \end{cases} \]

\[ \begin{cases} B_t = aB - 1 \\ B(T,T) = 0 \end{cases} \]
The bottom equation is a first order linear equation. This can easily be solved: Use $e^{-at}$ as an integrating factor to obtain

\[ B(t,T) = e^{at}\left[\frac{1}{a}e^{-at} + C(T)\right] \]

and then use $B(T,T) = 0$ to get $C(T) = -\frac{1}{a}e^{-aT}$. Hence

\[ B(t,T) = \frac{1}{a}\left[1 - e^{-a(T-t)}\right] \]



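Plug this into the equation for $A$ to obtain

\[ \begin{cases} A_t = bB(t,T) - \tfrac12\sigma^2B^2(t,T) = \dfrac{b}{a}\left[1 - e^{-a(T-t)}\right] - \dfrac{\sigma^2}{2a^2}\left[1 - e^{-a(T-t)}\right]^2 \\[4pt] A(T,T) = 0 \end{cases} \]

Integrate both sides:

\[ \begin{aligned} A(t,T) &= A(T,T) - \int_t^T A_t(s,T)\,ds \\ &= \frac{\sigma^2}{2}\int_t^T \frac{1}{a^2}\left[1 - e^{-a(T-s)}\right]^2 ds - b\int_t^T \frac{1}{a}\left[1 - e^{-a(T-s)}\right] ds \\ &= \frac{\left[B(t,T) - (T-t)\right]\left(ab - \tfrac12\sigma^2\right)}{a^2} - \frac{\sigma^2B^2(t,T)}{4a} \end{aligned} \]

Now that $A$, $B$ have been found, bond prices are given by the equation $p(t,T) = e^{A(t,T)-B(t,T)r(t)}$.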

□ 
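The closed-form Vasicek functions translate directly into code. The sketch below assumes the parametrisation $dr_t = (b - ar_t)\,dt + \sigma\,dW_t$ used in this example; the function names are illustrative only.

```python
import math

def vasicek_A_B(tau, a, b, sigma):
    """A(t,T) and B(t,T) for time to maturity tau = T - t (requires a != 0)."""
    B = (1.0 - math.exp(-a * tau)) / a
    A = (B - tau) * (a * b - 0.5 * sigma**2) / a**2 - sigma**2 * B**2 / (4.0 * a)
    return A, B

def vasicek_bond_price(r, tau, a, b, sigma):
    """Zero coupon bond price p(t,T) = exp(A - B r) given the current short rate r."""
    A, B = vasicek_A_B(tau, a, b, sigma)
    return math.exp(A - B * r)

def vasicek_yield(r, tau, a, b, sigma):
    """Continuously compounded yield -ln p(t,T) / (T - t), for tau > 0."""
    return -math.log(vasicek_bond_price(r, tau, a, b, sigma)) / tau
```

With $\sigma = 0$ and $r = b/a$ the short rate stays constant, so the price must reduce to $e^{-r\tau}$; this gives a quick check of the formulas.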



In order to invert the yield curve in the above example, the parameters $a, b, \sigma$ must now be chosen so that the model fits the empirical (observed) term structure of bond prices $\{p^*(0,T) : T \ge 0\}$ as "closely" as possible. Clearly, however, we have infinitely many bond prices, but only three parameters, i.e. the system is highly over-determined, and therefore we cannot generally choose $a, b, \sigma$ such that $e^{A(0,T)-B(0,T)r_0} = p^*(0,T)$ for every $T$, i.e. the model cannot be made to fit the observed term structure exactly (unless we are astoundingly fortunate). The Vasicek model is able to fit, exactly, just 3 bonds.

Example 8.3.3 Cox-Ingersoll-Ross model

The risk-neutral short rate dynamics assumed are

\[ dr_t = (b - ar_t)\,dt + \sigma\sqrt{r_t}\,dW_t, \qquad a, b, \sigma, r_0 > 0 \]

This is mean reverting (to $b/a$). Since the volatility term $\sigma\sqrt{r_t}$ tends to zero as $r_t \to 0$ (which is consistent with observation), positive rates are assured (which is also consistent with observation). Postulating $p(t,T) = e^{A(t,T)-B(t,T)r_t}$, we quickly determine, by substituting into the term structure PDE, that

\[ B_t = aB + \tfrac12\sigma^2B^2 - 1, \qquad B(T,T) = 0 \]
\[ A_t = bB, \qquad A(T,T) = 0 \]

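To solve the Riccati equation for $B$, we try a solution of the form

\[ B(t,T) = \frac{X(t)}{cX(t) + d} \]

Then

\[ B_t = \frac{X_t}{cX + d} - \frac{cXX_t}{(cX + d)^2} \]

and hence, substituting into the equation for $B_t$, we see that

\[ -dX_t + X^2\left(ac + \tfrac12\sigma^2 - c^2\right) + X(ad - 2cd) - d^2 = 0, \qquad X(T) = 0 \]

Choose $c$ to ensure that $ac + \tfrac12\sigma^2 - c^2 = 0$, i.e. $c = \tfrac12\left(a + \sqrt{a^2 + 2\sigma^2}\right)$. We then have a first order linear differential equation

\[ X_t + \kappa X = -d \qquad\text{where } \kappa = -a + 2c = \sqrt{a^2 + 2\sigma^2} \]

Since $X(T) = 0$, we see that

\[ X(t) = \frac{d}{\kappa}\left[e^{\kappa(T-t)} - 1\right] \]

Hence

\[ B(t,T) = \frac{X(t)}{cX(t) + d} = \frac{e^{\kappa(T-t)} - 1}{\tfrac12(\kappa + a)\left(e^{\kappa(T-t)} - 1\right) + \kappa} = \frac{2\left(e^{\kappa(T-t)} - 1\right)}{2\kappa + (a + \kappa)\left(e^{\kappa(T-t)} - 1\right)} \]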

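Then $A(t,T)$ is obtained by integrating:

\[ A(t,T) = A(T,T) - \int_t^T A_t(s,T)\,ds = -b\int_t^T B(s,T)\,ds \]

The solution is

\[ A(t,T) = \frac{2b}{\sigma^2}\ln\left[\frac{2\kappa\, e^{(a+\kappa)(T-t)/2}}{2\kappa + (a + \kappa)\left(e^{\kappa(T-t)} - 1\right)}\right] \qquad\text{where } \kappa = \sqrt{a^2 + 2\sigma^2} \]

as can be verified by differentiation.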



□ 
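The remark that the solution "can be verified by differentiation" can also be checked numerically. The sketch below evaluates the Cox-Ingersoll-Ross functions in terms of the time to maturity $\tau = T - t$ (so that $B_t = -dB/d\tau$ and $A_t = -dA/d\tau$); the function names are assumptions made for this illustration.

```python
import math

def cir_A_B(tau, a, b, sigma):
    """A and B of the CIR model dr = (b - a r) dt + sigma sqrt(r) dW, tau = T - t."""
    kappa = math.sqrt(a * a + 2.0 * sigma * sigma)
    e = math.exp(kappa * tau)
    denom = 2.0 * kappa + (a + kappa) * (e - 1.0)
    B = 2.0 * (e - 1.0) / denom
    A = (2.0 * b / sigma**2) * math.log(2.0 * kappa * math.exp(0.5 * (a + kappa) * tau) / denom)
    return A, B

def cir_bond_price(r, tau, a, b, sigma):
    """Zero coupon bond price exp(A - B r)."""
    A, B = cir_A_B(tau, a, b, sigma)
    return math.exp(A - B * r)

def cir_ode_residuals(tau, a, b, sigma, h=1e-5):
    """Central-difference residuals of dB/dtau = 1 - aB - sigma^2 B^2 / 2
    and dA/dtau = -bB; both should be close to zero."""
    A, B = cir_A_B(tau, a, b, sigma)
    A_p, B_p = cir_A_B(tau + h, a, b, sigma)
    A_m, B_m = cir_A_B(tau - h, a, b, sigma)
    res_B = (B_p - B_m) / (2 * h) - (1.0 - a * B - 0.5 * sigma**2 * B * B)
    res_A = (A_p - A_m) / (2 * h) + b * B
    return res_A, res_B
```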






Example 8.3.4 Ho-Lee Model 

We are given risk-neutral short rate dynamics $dr(t) = \theta(t)\,dt + \sigma\,dW_t$, where $\theta(t)$ is deterministic and $\sigma$ a constant. The model has ATS with $\alpha = 0$, $\beta = \theta$, $\gamma = 0$, $\delta = \sigma^2$. This leads to two differential equations. The first is

\[ \begin{cases} B_t = -1 \\ B(T,T) = 0 \end{cases} \]

which has solution $B(t,T) = T - t$ (as can be seen by integrating both sides from $t$ to $T$). The second DE is

\[ \begin{cases} A_t = \theta(t)B(t,T) - \tfrac12\sigma^2B^2(t,T) = \theta(t)(T-t) - \tfrac12\sigma^2(T-t)^2 \\ A(T,T) = 0 \end{cases} \]

Integrating both sides from $t$ to $T$ yields

\[ A(t,T) = -\int_t^T \theta(s)(T-s)\,ds + \frac{\sigma^2}{6}(T-t)^3 \]



We now choose the function $\theta(t)$ so as to fit the initial term structure of bond prices $\{p^*(0,T) : T \ge 0\}$, or, equivalently, the observed term structure of (instantaneous) forward rates $\{f^*(0,T) : T \ge 0\}$.

Recall that $f^*(0,T) = -\frac{\partial \ln p^*(0,T)}{\partial T}$. With affine term structure, we have $p^*(0,T) = e^{A(0,T)-B(0,T)r_0}$, so

\[ \ln p^*(0,T) = -\int_0^T \theta(s)(T-s)\,ds + \frac{\sigma^2}{6}T^3 - r_0T \]

Differentiating with respect to $T$, we see that

\[ f^*(0,T) = \int_0^T \theta(s)\,ds - \tfrac12\sigma^2T^2 + r_0 \]

Differentiating once more with respect to $T$, we obtain

\[ \frac{\partial f^*(0,T)}{\partial T} = \theta(T) - \sigma^2T \]

and thus we have found $\theta$:

\[ \theta(t) = f^*_T(0,t) + \sigma^2 t \]



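We can use this to calculate $A(t,T)$:

\[ \begin{aligned} A(t,T) &= \int_t^T \left(f^*_T(0,s) + \sigma^2 s\right)(s - T)\,ds + \frac{\sigma^2}{6}(T-t)^3 \\ &= f^*(0,s)(s-T)\Big|_t^T - \int_t^T f^*(0,s)\,ds + \sigma^2\int_t^T s(s-T)\,ds + \frac{\sigma^2}{6}(T-t)^3 \\ &= f^*(0,t)(T-t) + \int_t^T \frac{\partial \ln p^*(0,s)}{\partial s}\,ds - \tfrac12\sigma^2 t(T-t)^2 \\ &= f^*(0,t)(T-t) + \ln\frac{p^*(0,T)}{p^*(0,t)} - \tfrac12\sigma^2 t(T-t)^2 \end{aligned} \]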



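Using the fact that the Ho-Lee model has ATS, we see that bond prices are given by

\[ p(t,T) = \frac{p^*(0,T)}{p^*(0,t)}\exp\left(f^*(0,t)(T-t) - \tfrac12\sigma^2 t(T-t)^2 - (T-t)r_t\right) \]

where

\[ \begin{aligned} r_t &= r_0 + \int_0^t \theta(s)\,ds + \int_0^t \sigma\,dW_s \\ &= r_0 + f^*(0,t) - f^*(0,0) + \tfrac12\sigma^2t^2 + \sigma W_t \\ &= f^*(0,t) + \tfrac12\sigma^2t^2 + \sigma W_t \end{aligned} \]

because $r_0 = f^*(0,0)$. It follows that $E[r_t] \to \infty$ as $t \to \infty$ (under the riskneutral measure). This is clearly a flaw in the model.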

Since the short rate is Gaussian, future bond prices are lognormally distributed under the 
risk-neutral measure. In particular, there is a non-zero probability that a bond will, at some 
future date, trade above par (i.e. that interest rates become negative). This is clearly another 
flaw in the model. 

Now that we've calculated the evolution of future bond prices and rates, let's have a look at future forward rates. Since $f(t,T) = -\frac{\partial \ln p(t,T)}{\partial T}$, we see that

\[ \begin{aligned} f(t,T) &= f^*(0,T) - f^*(0,t) + \sigma^2 t(T-t) + r_t \\ &= f^*(0,T) + \sigma^2 t\left(T - \tfrac12 t\right) + \sigma W_t \end{aligned} \]

using the expression for $r_t$ obtained earlier. Note that $f(t,t) = r_t$.

Now if we fix $t > 0$, we see that $E[f(t,T)] \to \infty$ as $T \to \infty$. Indeed, for large values of $T$, $f(t,T) \approx kT$. Thus even if the initial forward curve is bounded above, it will be unbounded an instant later. This is another flaw in the Ho-Lee model.

□ 

Example 8.3.5 The Hull-White (extended Vasicek) Model 
Consider the short rate model with risk-neutral dynamics 

\[ dr_t = (b(t) - ar_t)\,dt + \sigma\,dW_t \]

where $b(t)$ is deterministic, $a, \sigma$ are constants and $W_t$ is a one-dimensional Brownian motion. This is clearly an affine term structure model $dr_t = (\alpha(t)r_t + \beta(t))\,dt + \sqrt{\gamma(t)r_t + \delta(t)}\,dW_t$, with $\alpha(t) = -a$, $\beta(t) = b(t)$, $\gamma(t) = 0$ and $\delta(t) = \sigma^2$. Substituting $p(t,T) = e^{A(t,T)-B(t,T)r_t}$ into the term structure PDE yields

\[ \begin{cases} B_t(t,T) = aB(t,T) - 1 \\ B(T,T) = 0 \end{cases} \]

and

\[ \begin{cases} A_t(t,T) = b(t)B(t,T) - \tfrac12\sigma^2B^2(t,T) \\ A(T,T) = 0 \end{cases} \]



Hence

\[ B(t,T) = \frac{1}{a}\left(1 - e^{-a(T-t)}\right) \]

\[ A(t,T) = \int_t^T \left[-b(u)B(u,T) + \tfrac12\sigma^2B^2(u,T)\right] du \]

Fitting the initial term structure of bond prices is equivalent to fitting the initial term structure of forward rates. The latter is more convenient. Now since $f(0,T) = -\frac{\partial \ln p(0,T)}{\partial T} = -A_T(0,T) + B_T(0,T)r_0$, and since $B_T(t,T) = e^{-a(T-t)}$, we observe

\[ \begin{aligned} f(0,T) &= \int_0^T \left[b(u)B_T(u,T) - \sigma^2B_T(u,T)B(u,T)\right] du + B_T(0,T)r_0 \\ &= \int_0^T b(u)e^{-a(T-u)}\,du - \frac{\sigma^2}{2a^2}\left(1 - e^{-aT}\right)^2 + e^{-aT}r_0 \end{aligned} \]
Jo 2a 

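We side-step the computation of these integrals using the following trick: Define

\[ x(t) = e^{-at}r_0 + \int_0^t b(u)e^{-a(t-u)}\,du \]

Note that

\[ x'(T) = -ar_0e^{-aT} + b(T) - a\int_0^T b(u)e^{-a(T-u)}\,du = -ax(T) + b(T) \]

Now $f(0,T) = x(T) - y(T)$, where $y(T) = \frac{\sigma^2}{2a^2}\left(1 - e^{-aT}\right)^2$, and so

\[ \begin{aligned} b(T) &= x'(T) + ax(T) \\ &= f_T(0,T) + y'(T) + ax(T) \\ &= f_T(0,T) + y'(T) + a\left[f(0,T) + y(T)\right] \end{aligned} \]

Thus, noting that $y(t) = \frac{\sigma^2}{2a^2}\left(1 - e^{-at}\right)^2 = \tfrac12\sigma^2B^2(0,t)$ and thus that $y'(t) = \frac{\sigma^2}{a}\left(1 - e^{-at}\right)e^{-at} = \sigma^2B(0,t)B_T(0,t)$, we obtain

\[ \begin{aligned} b(t) &= f_T(0,t) + \sigma^2B(0,t)B_T(0,t) + a\left[f(0,t) + \tfrac12\sigma^2B^2(0,t)\right] \\ &= f_T(0,t) + af(0,t) + \frac{\sigma^2}{2a}\left[1 - e^{-2at}\right] \end{aligned} \]

This is the function $b(t)$ which will fit forward rates to the observed term structure $\{f^*(0,T) : T \ge 0\}$.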

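Since we now know $b(t)$ we can calculate $A(t,T)$:

\[ A(t,T) = \int_t^T \left[-b(u)B(u,T) + \tfrac12\sigma^2B^2(u,T)\right] du \]

Now note that $b(u) = x'(u) + ax(u) = e^{-au}\frac{d}{du}\left(e^{au}x(u)\right)$, so that, integrating by parts,

\[ \begin{aligned} \int_t^T b(u)B(u,T)\,du &= \frac{1}{a}\int_t^T e^{-au}\frac{d}{du}\left(e^{au}x(u)\right)\left(1 - e^{-a(T-u)}\right) du \\ &= -\frac{1}{a}x(t)\left(1 - e^{-a(T-t)}\right) + \int_t^T x(u)\,du \\ &= -\left[f(0,t) + \frac{\sigma^2}{2}B^2(0,t)\right]B(t,T) + \int_t^T \left[f(0,u) + \frac{\sigma^2}{2}B^2(0,u)\right] du \\ &= -f(0,t)B(t,T) - \ln\frac{p(0,T)}{p(0,t)} - \frac{\sigma^2}{2}B^2(0,t)B(t,T) + \int_t^T \frac{\sigma^2}{2}B^2(0,u)\,du \end{aligned} \]

Hence

\[ A(t,T) = f(0,t)B(t,T) + \ln\frac{p(0,T)}{p(0,t)} + \frac{\sigma^2}{2}\left[\int_t^T \left(B^2(u,T) - B^2(0,u)\right) du + B^2(0,t)B(t,T)\right] \]

Now, after a few lines of manipulation,

\[ \int_t^T \left(B^2(u,T) - B^2(0,u)\right) du + B^2(0,t)B(t,T) = -\frac{1}{2a}B^2(t,T)\left(1 - e^{-2at}\right) \]

as you can easily check, substituting $B(t,T) = \frac{1}{a}\left(1 - e^{-a(T-t)}\right)$. Thus

\[ A(t,T) = f(0,t)B(t,T) + \ln\frac{p(0,T)}{p(0,t)} - \frac{\sigma^2}{4a}B^2(t,T)\left(1 - e^{-2at}\right) \]

Substituting into $p(t,T) = e^{A(t,T)-B(t,T)r_t}$, we obtain:

\[ p(t,T) = \frac{p(0,T)}{p(0,t)}\exp\left(f(0,t)B(t,T) - \frac{\sigma^2}{4a}B^2(t,T)\left(1 - e^{-2at}\right) - B(t,T)r_t\right) \]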



□ 



We have thus found the following bond prices: 



Theorem 8.3.6 (a) In the Ho-Lee model, bond prices (fitted to the initial term structure) are given by

\[ p(t,T) = \frac{p(0,T)}{p(0,t)}\exp\left(f(0,t)(T-t) - \tfrac12\sigma^2 t(T-t)^2 - (T-t)r(t)\right) \]

(b) In the Hull-White (extended Vasicek) model, bond prices (fitted to the initial term structure) are given by

\[ p(t,T) = \frac{p(0,T)}{p(0,t)}\exp\left(f(0,t)B(t,T) - \frac{\sigma^2}{4a}B^2(t,T)\left(1 - e^{-2at}\right) - B(t,T)r_t\right) \]

where $B(t,T) = \frac{1}{a}\left(1 - e^{-a(T-t)}\right)$.



□ 
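The two fitted bond price formulas of the theorem can be sketched in code as follows. The initial discount curve `p0` and initial forward curve `f0` are passed in as callables; that interface, and the function names, are assumptions made for this illustration. In practice `f0` must be consistent with `p0`, i.e. $f(0,t) = -\partial \ln p(0,t)/\partial t$.

```python
import math

def ho_lee_fitted_price(t, T, r_t, sigma, p0, f0):
    """Ho-Lee bond price p(t,T) fitted to the initial curve (Theorem part (a))."""
    tau = T - t
    expo = f0(t) * tau - 0.5 * sigma**2 * t * tau**2 - tau * r_t
    return p0(T) / p0(t) * math.exp(expo)

def hull_white_fitted_price(t, T, r_t, a, sigma, p0, f0):
    """Hull-White bond price p(t,T) fitted to the initial curve (Theorem part (b))."""
    B = (1.0 - math.exp(-a * (T - t))) / a
    expo = (f0(t) * B
            - sigma**2 / (4.0 * a) * B**2 * (1.0 - math.exp(-2.0 * a * t))
            - B * r_t)
    return p0(T) / p0(t) * math.exp(expo)

# Example with a flat 3% initial curve (an arbitrary illustrative choice).
flat_p0 = lambda T: math.exp(-0.03 * T)
flat_f0 = lambda t: 0.03
price_today = hull_white_fitted_price(0.0, 5.0, 0.03, 0.1, 0.01, flat_p0, flat_f0)
```

At $t = 0$ with $r_0 = f(0,0)$ both formulas return the observed price $p(0,T)$, and as $a \to 0$ the Hull-White price approaches the Ho-Lee price, which is a useful consistency check.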






8.3.2 Bond Options 

In the chapter on changes of numeraire, we obtained the following general option formula: The price of a call $C$ with strike $K$ and maturity $T$ on an underlying $S$ is given by

\[ C_0 = S_0\,Q^S(S_T > K) - Kp(0,T)\,Q^T(S_T > K) \]

where $Q^S, Q^T$ are the EMM's associated with numeraires $S_t$, $p(t,T)$ respectively.

In order to use this formula, and to get Black-Scholes type solutions to option pricing 
problems, we assumed that the volatility of the securities is deterministic, and then obtained 



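Theorem 8.3.7 If $\hat{S}_t = \frac{S_t}{p(t,T)}$ is a process of the form $\frac{d\hat{S}_t}{\hat{S}_t} = \mu(t)\,dt + \sigma(t)\cdot dW_t$, and if $\sigma(t)$ is deterministic, then the value of a call $C$ with strike $K$ and maturity $T$ on underlying security $S$ is given by

\[ C_0 = S_0N(d_1) - Kp(0,T)N(d_2) \]

where

\[ \sigma_{av} = \sqrt{\frac{1}{T}\int_0^T \|\sigma(t)\|^2\,dt}, \qquad d_1 = \frac{\ln\frac{S_0}{Kp(0,T)} + \tfrac12\sigma_{av}^2T}{\sigma_{av}\sqrt{T}}, \qquad d_2 = d_1 - \sigma_{av}\sqrt{T} \]

Put-call parity yields

\[ P_0 = -S_0N(-d_1) + Kp(0,T)N(-d_2) \]

for the price of a corresponding put.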

□ 

We can now use this theorem to price bond options. 
Example 8.3.8 Bond Options in the Ho-Lee Model 

Consider a European call option $C$ with strike $K$ and maturity $T$ on a discount bond $p(t,S)$ (where $S > T$). In the Ho-Lee model, with risk-neutral dynamics $dr_t = \theta(t)\,dt + \sigma\,dW_t$, bond prices have dynamics

\[ \frac{dp(t,T)}{p(t,T)} = r\,dt - \sigma(T-t)\,dW_t \]

The drift term is $r$, because bond prices have drift $r$ under the risk-neutral measure, just like all other traded securities. The volatility is obtained from the affine term structure: $p(t,T) = e^{A(t,T)-B(t,T)r_t}$, together with the fact that $B(t,T) = T - t$ (and we don't care about the value of $A(t,T)$ right now). Thus the bond volatilities are deterministic: $p(t,T)$ has volatility $-\sigma(T-t)$ and $p(t,S)$ has volatility $-\sigma(S-t)$. Now the underlying security is $p(t,S)$, and $\hat{p}(t,S) = \frac{p(t,S)}{p(t,T)}$ has deterministic (indeed, constant) volatility $-\sigma(S-t) + \sigma(T-t) = -\sigma(S-T)$. This is because the volatility of a ratio of two assets is just the difference of their volatilities. It follows that $\hat{p}(T,S)$ is lognormally distributed, and that $\ln\hat{p}(T,S)$ has variance $\sigma_{av}^2T = \int_0^T \sigma^2(T-S)^2\,dt = \sigma^2(S-T)^2T$. It follows that the price of the call is

\[ C_0 = p(0,S)N(d_1) - Kp(0,T)N(d_2) \]

where

\[ d_1 = \frac{\ln\frac{p(0,S)}{Kp(0,T)} + \tfrac12\sigma^2(S-T)^2T}{\sigma(S-T)\sqrt{T}}, \qquad d_2 = d_1 - \sigma(S-T)\sqrt{T} \]

□ 

Example 8.3.9 Bond Options in the Hull-White (extended Vasicek) Model 

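We tackle once more the problem of pricing a call with strike $K$ and maturity $T$ on a zero coupon bond $p(t,S)$, where $S > T$. It ought to be clear from the analysis of bond options in the Ho-Lee model that we need mainly to find the volatility of the bonds $p(t,T)$. Now, as for the Ho-Lee model, the riskneutral dynamics of $p(t,T)$ are

\[ \frac{dp(t,T)}{p(t,T)} = r\,dt - B(t,T)\sigma\,dW_t \]

so that the volatility of $p(t,T)$ is $-\frac{\sigma}{a}\left(1 - e^{-a(T-t)}\right)$. The asset ratio $\hat{p}_t = p(t,S)/p(t,T)$ therefore has volatility $\frac{\sigma}{a}\left(e^{-aS} - e^{-aT}\right)e^{at}$ at time $t$. Thus the average volatility-squared is

\[ \sigma_{av}^2 = \frac{1}{T}\int_0^T \frac{\sigma^2}{a^2}\left(e^{-aS} - e^{-aT}\right)^2 e^{2at}\,dt = \frac{\sigma^2}{2a^3T}\left(1 - e^{-a(S-T)}\right)^2\left(1 - e^{-2aT}\right) \]

We now find that the value of the call is simply

\[ C_0 = p(0,S)N(d_1) - Kp(0,T)N(d_2) \]

where

\[ d_1 = \frac{\ln\frac{p(0,S)}{Kp(0,T)} + \tfrac12\sigma_{av}^2T}{\sigma_{av}\sqrt{T}}, \qquad d_2 = d_1 - \sigma_{av}\sqrt{T} \]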

□ 
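The bond option recipe of the last two examples (find the deterministic volatility of the ratio $p(t,S)/p(t,T)$, average its square over $[0,T]$, then apply the Black-Scholes type formula) can be sketched as follows. The function names are illustrative, and the normal distribution function is built from `math.erf` so that no external library is assumed.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ho_lee_sigma_av(T, S, sigma):
    """Average volatility of p(t,S)/p(t,T) in the Ho-Lee model: sigma (S - T)."""
    return sigma * (S - T)

def hw_sigma_av(T, S, a, sigma):
    """Average volatility of p(t,S)/p(t,T) over [0,T] in the Hull-White model."""
    total_var = (sigma**2 / (2.0 * a**3)
                 * (1.0 - math.exp(-a * (S - T)))**2
                 * (1.0 - math.exp(-2.0 * a * T)))
    return math.sqrt(total_var / T)

def bond_call(p0S, p0T, K, T, sigma_av):
    """Call with strike K and maturity T on the S-bond; p0S = p(0,S), p0T = p(0,T)."""
    sd = sigma_av * math.sqrt(T)
    d1 = (math.log(p0S / (K * p0T)) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return p0S * norm_cdf(d1) - K * p0T * norm_cdf(d2)
```

As $a \to 0$ the Hull-White average volatility tends to the Ho-Lee value $\sigma(S-T)$, mirroring the relationship between the two models.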



8.4 The Heath-Jarrow-Morton Framework
8.4.1 The Set-Up 

Up till now, we have studied interest rate models in which the short rate is the only explanatory variable. Such an approach has many obvious advantages:

• Specifying r as the solution of an SDE allows us to use Markov theory, which leads to 
PDE's (e.g., via the Feynman-Kac theorem, or the Kolmogorov forward and backward 
equations) that can be solved; 

• If we're lucky, we can obtain analytical formulas for bond prices and bond option prices (as we did for the Ho-Lee and Hull-White (extended Vasicek) models).

However, the short rate modelling approach has some obvious disadvantages as well: 

• It is unreasonable to regard the short rate as the only explanatory variable — it is 
difficult to incorporate views about different times in the future; 



• It can be quite difficult to fit a realistic volatility structure;

• In order for the model to have even a remote chance of being correct, it is necessary to 
invert the yield curve (i.e. to fit the model to the initial term structure of bond prices). 
This can be quite difficult as well. 

The Heath-Jarrow-Morton (HJM) approach circumvents some of these difficulties by specifying dynamics for the entire (uncountable) family of forward rates. For a fixed $T > 0$, assume that the forward rate $f(t,T)$ has "real-world" dynamics

\[ df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,dW_t, \qquad T \ge 0,\ 0 \le t \le T \]

where $W_t$ is a finite-dimensional Brownian motion under the real world measure $P$, and $\alpha(t,T)$ and $\sigma(t,T)$ are adapted (and sufficiently regular to ensure that most of the operations below are permissible. For example, it is often necessary to assume that $\sigma(t,T)$ is jointly measurable in the $t$- and $T$-variables.)

Thus we have infinitely many SDE's, one for each maturity $T$. Each such SDE has an initial condition, namely $f(0,T) = f^*(0,T)$, where $f^*(0,T)$ is the observed term structure. The advantage of this approach is that the initial term structure is fitted automatically (it is an initial condition!) so that inverting the yield curve becomes unnecessary. It is also easier to incorporate views about different maturities, because we have many different SDE's. (The disadvantage, of course, is that we have many, many SDE's.) These are still manageable, because we assume that the bond market is driven by finitely many sources of noise. But this leads to another difficulty:

Remarks 8.4.1 Given $\alpha(t,T)$, $\sigma(t,T)$ and $\{f^*(0,T) : T \ge 0\}$, we can solve the SDE's for the forward rate, so that we have specified the entire term structure $\{f(t,T) : T \ge 0,\ 0 \le t \le T\}$ at all times and all maturities, and thus the entire term structure of bond prices

\[ p(t,T) = e^{-\int_t^T f(t,u)\,du} \]

Since we have only finitely many sources of noise, and infinitely many traded assets, there is a possibility of arbitrage in the bond market, unless the bond prices are inter-related in a specific way (which amounts to all bond prices having the same market price of risk, for all sources of noise). This will impose conditions on the functions $\alpha$ and $\sigma$.

□ 

Remarks 8.4.2 HJM is not a model, but a framework of models for the bond market; short rate models are another such framework. But whereas short rate models are generally Ito diffusions, and thus Markov processes, we can easily let $\alpha$ and $\sigma$ depend on past history. HJM models therefore need not be Markov models. (Of course, short rate models do not really need to be Markov either, but then their dynamics cannot be given by diffusions. We shall discuss the relationship between short rate and HJM models in the next section.)

□ 

For a market model (driven by Brownian motions) with only finitely many securities, we know that the model is arbitrage-free if and only if we can construct a risk-neutral measure, and complete if that measure is unique. Equivalently, the market is complete if and only if there are as many traded risky securities as Brownian motions, subject to some conditions which ensure that the traded securities are, in some sense, independent (where "independent" is meant in the sense of linear algebra, and not probability). The fact that there are only finitely many sources of noise, but infinitely many traded assets, means that the market is "over-complete", i.e. that there may be many ways of replicating a security. Unless all such replicating portfolios have the same price, there will be arbitrage. In practice, all securities must have the same market price of risk. If that's the case, we can construct a riskneutral measure (via a Girsanov transformation), which implies that the market is arbitrage-free. The arbitrage theory we've developed thus far only applies to markets with just finitely many traded securities, and it isn't at all clear that the impossibility of arbitrage implies the existence of a riskneutral measure (i.e. a measure under which all uncountably many zero coupon bond prices, when discounted, become martingales). We can, however, construct riskneutral measures for any finite subset of zero coupon bonds. Nevertheless, it is highly desirable to have a single riskneutral measure for all bonds simultaneously (because prices of securities are then just expected discounted payoffs, where the expectation is taken w.r.t. the riskneutral measure). We will therefore try to impose a strong form of the no-arbitrage condition: The existence of a riskneutral measure for all bonds.

To enable us to construct such a riskneutral measure, there must be relationships between $\alpha(t,T)$ and $\sigma(t,T)$ that must hold if the HJM model is to be arbitrage-free:

Proposition 8.4.3 Assume that the bond market is arbitrage-free in the strong sense, i.e. assume that there is a risk-neutral measure for bonds of all maturities. Then there is a (multidimensional) process $\lambda(t)$ such that, for all maturities $T$,

\[ \alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)^{\mathrm{tr}}\,ds + \sigma(t,T)\lambda(t) \]

Proof: Recall that

\[ \frac{dp(t,T)}{p(t,T)} = \left[r(t) + A(t,T) + \tfrac12\|S(t,T)\|^2\right] dt + S(t,T)\,dW_t \]

where $W_t$ is a $P$-Brownian motion. Here

\[ A(t,T) = -\int_t^T \alpha(t,s)\,ds, \qquad S(t,T) = -\int_t^T \sigma(t,s)\,ds \]

If we use a Girsanov transformation with kernel $-\lambda$ to change to a new measure $Q$, then the new dynamics of $p(t,T)$ are

\[ \frac{dp(t,T)}{p(t,T)} = \left[r(t) + A(t,T) + \tfrac12\|S(t,T)\|^2 - S(t,T)\lambda(t)\right] dt + S(t,T)\,dW_t \]

where $W_t$ is now a $Q$-Brownian motion. For $Q$ to be a riskneutral measure, each $p(t,T)$ must have drift $r(t)$, i.e.

\[ A(t,T) + \tfrac12\|S(t,T)\|^2 - S(t,T)\lambda(t) = 0 \]

This shows that $\lambda$ is just the market price of risk of $p(t,T)$ at time $t$, for all $T$: All bonds have the same market price of risk.

Differentiating this equation with respect to $T$ yields

\[ -\alpha(t,T) + \sigma(t,T)\int_t^T \sigma(t,s)^{\mathrm{tr}}\,ds + \sigma(t,T)\lambda(t) = 0 \]

Suppose that we have an HJM model driven by $d$ sources of noise, so that each $\sigma(t,T)$ is a $d$-dimensional row vector $\sigma = (\sigma_1, \ldots, \sigma_d)$, and $\lambda = (\lambda_1, \ldots, \lambda_d)^{\mathrm{tr}}$ is a $d$-dimensional column vector. We then have

\[ \alpha(t,T) = \sum_{i=1}^d \sigma_i(t,T)\int_t^T \sigma_i(t,s)\,ds + \sum_{i=1}^d \sigma_i(t,T)\lambda_i(t) \qquad (*) \]

If we take $\alpha$ and $\sigma$ as given, we can try and solve for $\lambda$. We then have uncountably many equations in just $d$ unknowns $\lambda_1(t), \ldots, \lambda_d(t)$, one equation for each $T$. Thus $\alpha$, $\sigma$ cannot be specified arbitrarily. What we can do is

• Specify the volatility surface $\sigma(t,T)$.

• Choose $d$ benchmark maturities $T_1, \ldots, T_d$ and specify $\alpha(t,T_1), \ldots, \alpha(t,T_d)$.

• Solve the system (*) of $d$ equations for the $d$ unknowns $\lambda_1(t), \ldots, \lambda_d(t)$.

All the other $\alpha(t,T)$ (for $T$ not a benchmark maturity) are now given by (*).

8.4.2 Martingale Modelling 

As for short rate models, it is often convenient to bypass the necessity of estimating the market price of risk, and to model directly under the risk-neutral measure $Q$, i.e. we write

\[ df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,dW_t, \qquad T \ge 0,\ 0 \le t \le T, \qquad f(0,T) = f^*(0,T) \]

where $W_t$ is a $Q$-Brownian motion. Under $Q$, the market price of risk is $\lambda = 0$, so we obtain:

Proposition 8.4.4 (HJM Drift Conditions)

The riskneutral dynamics of forward rates satisfy the following conditions:

\[ \alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)^{\mathrm{tr}}\,ds \]

□ 

Thus in the riskneutral world, the drifts $\alpha(t,T)$ are completely determined by the volatility surface $\sigma(t,T)$. To create an HJM model, therefore, just follow these steps:

• Estimate (or otherwise specify) a volatility surface $\sigma(t,T)$.

• Calculate the drifts $\alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)^{\mathrm{tr}}\,ds$.

• Observe the term structure of forward rates $\{f^*(0,T) : T \ge 0\}$. This involves building a yield curve for all maturities.

• Integrate:

\[ f(t,T) = f^*(0,T) + \int_0^t \alpha(u,T)\,du + \int_0^t \sigma(u,T)\,dW_u \]

• Compute bond prices $p(t,T) = e^{-\int_t^T f(t,u)\,du}$ and the prices of other interest rate derivatives.
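The second step, computing the drift from the volatility surface, is easy to carry out numerically when the integral has no convenient closed form. The sketch below assumes the volatility surface is supplied as a function returning a tuple with one component per source of noise; the function name `hjm_drift` and the use of composite Simpson quadrature are choices made for this illustration only.

```python
import math

def hjm_drift(sigma, t, T, n=2000):
    """Risk-neutral HJM drift alpha(t,T) = sum_i sigma_i(t,T) * int_t^T sigma_i(t,s) ds.

    sigma(t, T) must return a tuple (sigma_1, ..., sigma_d).
    The inner integral is approximated by composite Simpson quadrature.
    """
    if n % 2:
        n += 1                                  # Simpson needs an even number of panels
    h = (T - t) / n
    d = len(sigma(t, T))
    integral = [0.0] * d
    for k in range(n + 1):
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        values = sigma(t, t + k * h)
        for i in range(d):
            integral[i] += weight * values[i]
    integral = [h / 3.0 * x for x in integral]
    return sum(s_i * I_i for s_i, I_i in zip(sigma(t, T), integral))

# Constant volatility: alpha(t,T) = sigma^2 (T - t).
alpha_const = hjm_drift(lambda t, T: (0.01,), 1.0, 3.0)
```

For a constant volatility the result is $\sigma^2(T-t)$, and for an exponentially damped volatility $\sigma e^{-a(T-t)}$ it is $\frac{\sigma^2}{a}\left[e^{-a(T-t)} - e^{-2a(T-t)}\right]$; both cases can be used to validate the quadrature.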

8.4.3 Examples and Applications 

Example 8.4.5 We consider here the simplest possible HJM model: We have only one source of noise, and put $\sigma(t,T) = \sigma$ = constant for all $t, T$. By the HJM drift conditions, we see that

\[ \alpha(t,T) = \sigma\int_t^T \sigma\,ds = \sigma^2(T-t) \]

under the riskneutral measure. Hence the riskneutral dynamics of forward rates are

\[ df(t,T) = \sigma^2(T-t)\,dt + \sigma\,dW_t, \qquad f(0,T) = f^*(0,T) \]

Integrate this to obtain

\[ f(t,T) = f^*(0,T) + \sigma^2t\left(T - \tfrac12t\right) + \sigma W_t \]

so that

\[ r(t) = f(t,t) = f^*(0,t) + \tfrac12\sigma^2t^2 + \sigma W_t \]

and thus the short rate dynamics are given by

\[ dr_t = \left[f^*_T(0,t) + \sigma^2t\right] dt + \sigma\,dW_t \]

These short rate dynamics should be familiar: We've obtained the Ho-Lee model fitted to the initial term structure! Note that we didn't have to do the actual fitting: in the HJM framework, fitting is automatic.

□ 

Thus the Ho-Lee model is (equivalent to) the simplest HJM model. 

Example 8.4.6 Can the Hull-White (extended Vasicek) model be recast in the HJM framework?

Indeed it can. The Hull-White model $dr_t = (b(t) - ar_t)\,dt + \sigma\,dW_t$ is an affine term structure model, with bond prices $p(t,T) = e^{A(t,T)-B(t,T)r_t}$. Hence $f(t,T) = -A_T(t,T) + B_T(t,T)r_t$, which means

\[ df(t,T) = [\,\cdot\,]\,dt + B_T(t,T)\sigma\,dW_t \]

where we haven't bothered to calculate the coefficient of the $dt$-term (which is, of course, just $\alpha(t,T)$). But for the Hull-White model, it was easy to calculate $B(t,T) = \frac{1}{a}\left[1 - e^{-a(T-t)}\right]$, so that $B_T(t,T) = e^{-a(T-t)}$. It follows that

\[ df(t,T) = \alpha(t,T)\,dt + \sigma e^{-a(T-t)}\,dW_t \]

Thus $\sigma(t,T) = \sigma e^{-a(T-t)}$. We can now use the HJM drift conditions to calculate

\[ \alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)\,ds = \frac{\sigma^2}{a}\left[e^{-a(T-t)} - e^{-2a(T-t)}\right] \]

To verify that the above model leads to the Hull-White model, recall that the short rate dynamics can be deduced from the forward rate dynamics as follows:

\[ dr_t = \left[f_T(t,t) + \alpha(t,t)\right] dt + \sigma(t,t)\,dW_t \]

Now $\sigma(t,t) = \sigma$, and $\alpha(t,t) = 0$. Finally,

\[ f(t,T) = f(0,T) + \int_0^t \alpha(u,T)\,du + \int_0^t \sigma(u,T)\,dW_u \]

which implies that

\[ r(t) = \Theta(t) + \int_0^t \sigma e^{-a(t-u)}\,dW_u \]

for some function $\Theta(t)$, and hence that

\[ \begin{aligned} dr(t) &= \Theta'(t)\,dt - \left(a\int_0^t \sigma e^{-a(t-u)}\,dW_u\right) dt + \sigma\,dW_t \\ &= \left[\Theta'(t) - a\left(r(t) - \Theta(t)\right)\right] dt + \sigma\,dW_t \\ &= \left[b(t) - ar(t)\right] dt + \sigma\,dW_t \end{aligned} \]

Moreover, $b(t) = \Theta'(t) + a\Theta(t)$, and $\Theta(t) = f(0,t) + \int_0^t \alpha(u,t)\,du = f(0,t) + \frac{\sigma^2}{2a^2}\left(1 - e^{-at}\right)^2$.

This is exactly the value of $b(t)$ which we obtained for the Hull-White model fitted to the initial term structure.

□ 

Remarks 8.4.7 The above example suggests a simple mechanism for turning a fitted affine term structure model $dr_t = \mu_r\,dt + \sigma_r\,dW_t$ into an HJM model:

• If $p(t,T) = e^{A(t,T)-B(t,T)r_t}$, solve the (Riccati) ODE for $B(t,T)$.

• Then the HJM volatility surface is given by $\sigma(t,T) = B_T(t,T)\sigma_r$.

• The HJM drift conditions now specify $\alpha(t,T)$ as well.

□ 

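Example 8.4.8 We consider a model with two sources of noise $W^1, W^2$ and a volatility surface

\[ \sigma(t,T) = \left(\sigma_1,\ \sigma_2e^{-a(T-t)}\right) \]

where $\sigma_1, \sigma_2, a$ are positive constants. The HJM drift conditions dictate that

\[ \alpha(t,T) = \sigma_1^2(T-t) + \frac{\sigma_2^2}{a}\left[e^{-a(T-t)} - e^{-2a(T-t)}\right] \]

Integrating the forward rate dynamics, we see

\[ f(t,T) = f(0,T) + \sigma_1^2t\left(T - \tfrac12t\right) + \frac{\sigma_2^2}{2a^2}\left[2e^{-aT}\left(e^{at} - 1\right) - e^{-2aT}\left(e^{2at} - 1\right)\right] + \sigma_1W_t^1 + \sigma_2\int_0^t e^{-a(T-u)}\,dW_u^2 \]

Thus

\[ \begin{aligned} r_t = f(t,t) &= f(0,t) + \tfrac12\sigma_1^2t^2 + \frac{\sigma_2^2}{2a^2}\left(1 - e^{-at}\right)^2 + \sigma_1W_t^1 + \sigma_2\int_0^t e^{-a(t-u)}\,dW_u^2 \\ &= \Theta(t) + \sigma_1W_t^1 + \sigma_2\int_0^t e^{-a(t-u)}\,dW_u^2 \end{aligned} \]

Thus the short rate is a Gaussian process, and

\[ \begin{aligned} dr_t &= \left[\Theta'(t) - a\sigma_2\int_0^t e^{-a(t-u)}\,dW_u^2\right] dt + \sigma_1\,dW_t^1 + \sigma_2\,dW_t^2 \\ &= \left[\Theta'(t) - a\left(r_t - \Theta(t) - \sigma_1W_t^1\right)\right] dt + \sigma_1\,dW_t^1 + \sigma_2\,dW_t^2 \\ &= \left[\Theta'(t) + a\Theta(t) - ar_t + a\sigma_1W_t^1\right] dt + \sigma_1\,dW_t^1 + \sigma_2\,dW_t^2 \end{aligned} \]

This is not the form of one of our standard short rate models, because of the explicit presence of $W_t^1$ in the drift.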



□ 



8.5 Market Models: Preliminaries 

The HJM approach studies the entire term structure of instantaneous forward rates $\{f(t,T) : t \le T\}$, with considerable success, as we have seen. Nevertheless, forward rates for only a
few maturities are available in the market, so the forward rate curve, like the instantaneous 
short rate, is a purely mathematical entity, a mathematical idealization. Market models, on 
the other hand, model observable (i.e. market-quoted) rates rather than idealized entities, 
and thus simple, discrete rates. 

The London Interbank Offered Rates (LIBOR), for example, are quoted for different maturities (3-month, 6-month, etc.) and also for different currencies. These LIBOR spot rates
imply LIBOR forward rates using an arbitrage argument. New LIBOR quotes are available 
daily. Swap rates (the fair rates for interest rate swaps) are another example of discrete 
market-quoted rates. The market model approach to interest rates dates back to Miltersen, 
Sandmann and Sondermann (1997), Brace, Gatarek and Musiela (1997) and Jamshidian 
(also 1997). Several other approaches now exist, due to Hunt and Kennedy, and Musiela 
and Rutkowski, amongst others. It remains one of the most intensively researched areas of 
financial mathematics. 

8.5.1 Black's Models 

Black's model has long been the industry-standard model used by traders to price a variety 
of European-style options, including interest rate options, such as caps, floors, and swaptions. 
It is essentially a minor variation on the Black-Scholes formula, as we shall shortly see. 
Nevertheless, the suitability and adequacy of Black's model has often been questioned by 
academics, particularly in the arena of interest rate options. 

Consider a European call option $C$ with strike $K$ and maturity $T$ on some market variable $X$. $X$ need not be a traded instrument: it could also be a market-quoted interest rate, for example. The main assumption is that $X_T$ is lognormally distributed in the riskneutral world. Thus we make no assumptions on the distribution of the process $(X_t)_t$ in general, but just on the value of $X$ at the expiry of the option. We further define the "volatility" of $X_T$ to be a non-negative number $\sigma$ satisfying

\[ \text{variance of } \ln X_T = \sigma^2T \]

Let $Q$ be a riskneutral measure. Then the $t = 0$-value of the call is

\[ C_0 = E_Q\left[e^{-\int_0^T r_t\,dt}(X_T - K)^+\right] \]

Black uses two approximations to determine the value of $C_0$:

• Approximate

\[ E_Q\left[e^{-\int_0^T r_t\,dt}(X_T - K)^+\right] \approx p(0,T)\,E_Q\left[(X_T - K)^+\right] \]

i.e. discount outside the expectation operator.

• Now because $X_T$ is lognormal under $Q$, we know that

\[ E_Q\left[(X_T - K)^+\right] = E_Q[X_T]N(d_1) - KN(d_2) \]

where

\[ d_1 = \frac{\ln\frac{E_Q[X_T]}{K} + \tfrac12\sigma^2T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T} \]

Approximate

\[ E_Q[X_T] = \text{forward price/rate of } X = F_0 \]

i.e. approximate the expectation by the forward price/rate.

Since the forward price of $X$ at time $T$ for time $T$ is just itself (i.e. $F_T = X_T$), this can be interpreted as saying that the forward rate process has zero drift, i.e. is a $Q$-martingale.

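Thus, using these two approximations, we obtain Black's model for a call on $X$:

\[ C_0 = P(0,T)\left[F_0N(d_1) - KN(d_2)\right] \]

where

\[ d_1 = \frac{\ln\frac{F_0}{K} + \tfrac12\sigma^2T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T} \]

A similar formula is obtained for puts, using put-call parity.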

If payments are based on a variable $X_T$, but only received at some later date $T^*$, then discounting must be done from time $T^*$ rather than from time $T$. Black's model then generalizes to give call prices

\[ C_0 = P(0,T^*)\left[F_0N(d_1) - KN(d_2)\right] \]

where

\[ d_1 = \frac{\ln\frac{F_0}{K} + \tfrac12\sigma^2T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T} \]

and where $F_0$ is still the $T$-forward value of $X$ at time $t = 0$. The appropriate generalized Black formula for put options follows once again by put-call parity.
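Black's formula, including the generalization to a later payment date, is short enough to sketch directly. In the code below the discount factor is passed in explicitly, so that $P(0,T)$ or $P(0,T^*)$ can be used as appropriate; the function names and argument order are assumptions made for this illustration.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_price(F0, K, sigma, T, discount, call=True):
    """Black's model price.

    F0       : T-forward value of X at time 0
    K        : strike
    sigma    : volatility, i.e. Var(ln X_T) = sigma^2 T
    T        : option expiry
    discount : P(0,T), or P(0,T*) if payment is made at a later date T*
    """
    sd = sigma * math.sqrt(T)
    d1 = (math.log(F0 / K) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    if call:
        return discount * (F0 * norm_cdf(d1) - K * norm_cdf(d2))
    return discount * (K * norm_cdf(-d2) - F0 * norm_cdf(-d1))

# Illustrative numbers only: forward 5%, strike 4%, 20% volatility, 2 years.
call_value = black_price(0.05, 0.04, 0.2, 2.0, 0.9, call=True)
put_value = black_price(0.05, 0.04, 0.2, 2.0, 0.9, call=False)
```

The two prices satisfy put-call parity in forward form, $C_0 - P_0 = P(0,T)(F_0 - K)$, which is how the put formula is obtained.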

Now it ought to be clear that Black's model has several flaws. Firstly, it cannot be appropriate to use the first approximation when $X$ depends on interest rates, as it amounts to saying that

\[ E_Q\left[e^{-\int_0^T r_t\,dt}(X_T - K)^+\right] \approx E_Q\left[e^{-\int_0^T r_t\,dt}\right] E_Q\left[(X_T - K)^+\right] \]

which is close to asserting that $r$ and $X_T$ are independent. That's a dangerous assumption if $X$ happens to be an interest rate derivative! There is no justification for the second approximation either. The expected value of $X_T$ under the riskneutral measure is its futures price, whereas the forward price is the expected value of $X_T$ under the $T$-forward riskneutral measure. These measures are not the same if interest rates are stochastic.

In spite of these flaws, Black's model remains heavily used: it is the industry standard. The method can be justified, provided that the relevant variable is taken to be lognormal under a different measure, associated with a different numeraire. We shall give several examples of this below. Review material on changes of measure and numeraire may be found in the next subsection.



Example 8.5.1 Bond Options: Lognormal prices 

We consider a call $C$ with strike $K$ and maturity $T$ on a coupon bearing bond $B$. We assume that the bond price at time $T$ is lognormally distributed (under the riskneutral measure), and that $\ln B_T$ has variance $\sigma^2T$. This "volatility" $\sigma$ is obtained from historical data (or implied by other market variables).¹

The $T$-forward bond price $F_0$ is simply the fair price which sets the value of a forward contract on $B$ equal to zero. A simple arbitrage argument shows

\[ F_0 = \frac{B_0 - D}{P(0,T)} \]

where $B_0$ is the current value of the bond, $P(0,T)$ is the discount bond maturing at time $T$, and $D$ is the present value of all coupons (dividends) paid out during the life of the option. Thus Black's model determines

\[ C_0 = (B_0 - D)N(d_1) - KP(0,T)N(d_2) \]

where

\[ d_1 = \frac{\ln\frac{F_0}{K} + \tfrac12\sigma^2T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T} \]

The above call price is an approximation under the assumption that $B_T$ is lognormal under $Q$, but exact if we assume lognormality of $B_T$ under the $T$-forward riskneutral measure $Q^T$. Of course, bond put options can be evaluated by put-call parity.

□ 

¹ In practice, yield volatilities are often obtained. If $\sigma_y$ is the volatility of the yield (i.e. if $\sigma_y\sqrt{T}$ is the standard deviation of the logarithm of the forward yield $\ln y_T$), then (with $D^*$ = duration) we have $\frac{\Delta B}{B} \approx -D^*\Delta y = -D^*y_0\frac{\Delta y}{y_0}$, i.e. $\Delta(\ln B) \approx -D^*y_0\,\Delta(\ln y)$. Thus the variance of $\ln B$ is approximately $(D^*y_0)^2 \times$ the variance of $\ln y$, i.e. $\sigma_B \approx D^*y_0\sigma_y$.



Example 8.5.2 Caps: Lognormal LIBOR Rates 

An interest rate cap is an option-like contract which protects the holder against a floating 
interest rate moving too high. Each cap is a portfolio of caplets, each for a certain future 
time interval. A caplet is essentially a call option on the floating rate, given a certain cap 
rate as strike, based on a given notional amount. Consider, for example, a five-year cap, on 
a notional amount A, with cap rate R and semiannual resets based on 6-month LIBOR. This 
is a portfolio of 10 caplets. The reset dates Tq = 0, Ti = 0.5, T2 = 1,. . . Tio = 5 are referred to 
as the tenor structure of the cap. The n*^ caplet protects the holder against 6-month LIBOR 
rising above R over the period [T„_i,r„]. It is a call option with strike R on the 6— month 
spot LIBOR L(r„_i) at time r„_i, and will have the following payoff at time r„: 

Payoff of n^^ cap = A(5n(L(T„_i) - where (5„ = r„ - T^-i 

(This is a payment-in-arrears cap. The first caplet is generally excluded from the cap, because 
there is no uncertainty about the spot LIBOR L{Tq).) 

To price the $n$-th caplet using Black's model, we assume that the future spot LIBOR
$L(T_{n-1})$ is lognormally distributed, with volatility $\sigma_{n-1}$. The $t = 0$ forward LIBOR rate (i.e.
the $F_0$ of Black's model) for the period $[T_{n-1}, T_n]$ is given by

\[ L(0,T_{n-1}) = \frac{P(0,T_{n-1}) - P(0,T_n)}{\delta_n P(0,T_n)} \]

(In this notation, the future spot rate, $L(T_{n-1})$, is just $L(T_{n-1},T_{n-1})$.)
Hence the $t = 0$ value of the $n$-th caplet is

\[ C_n(0) = A\delta_n P(0,T_n)\left[L(0,T_{n-1})N(d_{1,n-1}) - RN(d_{2,n-1})\right] \]

\[ d_{1,n-1} = \frac{\ln\frac{L(0,T_{n-1})}{R} + \frac12\sigma_{n-1}^2 T_{n-1}}{\sigma_{n-1}\sqrt{T_{n-1}}} \]

\[ d_{2,n-1} = d_{1,n-1} - \sigma_{n-1}\sqrt{T_{n-1}} \]

The price of the cap is therefore the sum of the prices of the caplets (though, as we have
mentioned, the first caplet is often excluded, i.e. $C_1(0)$ is set to zero).

The above price for a cap is an approximation, assuming that each future LIBOR spot rate
$L(T_n)$ is lognormal under the riskneutral measure $\mathbb{Q}$. The formula for each caplet is exact,
however, if it is assumed that $L(T_n)$ is lognormal under the $T_{n+1}$-forward measure. For then
indeed

\[ \frac{C_n(0)}{P(0,T_n)} = \mathbb{E}_{\mathbb{Q}^{T_n}}\left[\frac{A\delta_n[L(T_{n-1}) - R]^+}{P(T_n,T_n)}\right] \]

which justifies the first approximation used in Black's model (i.e. discounting outside the
expectation). Moreover, the second approximation is exact, i.e. the forward LIBOR rate
$L(0,T_{n-1})$ is exactly equal to the expected value of the spot rate, but under the forward
riskneutral measure: $L(0,T_{n-1}) = \mathbb{E}_{\mathbb{Q}^{T_n}}[L(T_{n-1})]$. To see this, note that a long forward rate
agreement $F$, initiated at time $t = 0$ for period $[T_{n-1},T_n]$, will have initial value $F_0 = 0$, and
terminal value $F_{T_n} = \delta_n[L(T_{n-1}) - L(0,T_{n-1})]$. Hence

\[ 0 = \frac{F_0}{P(0,T_n)} = \mathbb{E}_{\mathbb{Q}^{T_n}}\left[\frac{F_{T_n}}{P(T_n,T_n)}\right] = \delta_n\left(\mathbb{E}_{\mathbb{Q}^{T_n}}[L(T_{n-1})] - L(0,T_{n-1})\right) \]

which yields the required result (because $L(0,T_{n-1})$ is a known constant).






So in order for the Black price of a cap to be accurate, we must simultaneously assume
that each $L(T_n)$ is lognormal under $\mathbb{Q}^{T_{n+1}}$. This seems difficult to justify theoretically. One
of the achievements of LIBOR market models is that they provide a framework under which
these assumptions all do hold simultaneously, thus showing that the use of Black's model does
not lead automatically to arbitrage opportunities.

□ 
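The caplet and cap formulas of this example can be sketched in a few lines of Python. This is only an illustration under the example's assumptions (lognormal spot LIBOR, payment in arrears, first caplet excluded); the function names and the argument layout (`tenors[n]` for $T_n$, `discounts[n]` for $P(0,T_n)$, `vols[n]` for $\sigma_n$) are hypothetical choices made here, not part of the notes.

```python
from math import erf, log, sqrt


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def forward_libor(p_start, p_end, delta):
    """L(0,T_{n-1}) = (P(0,T_{n-1}) - P(0,T_n)) / (delta_n * P(0,T_n))."""
    return (p_start - p_end) / (delta * p_end)


def black_caplet(notional, strike, delta, p_start, p_end, t_reset, vol):
    """Black value at t = 0 of the caplet on [T_{n-1}, T_n]; t_reset = T_{n-1} > 0."""
    fwd = forward_libor(p_start, p_end, delta)
    sd = vol * sqrt(t_reset)  # standard deviation of ln L(T_{n-1})
    d1 = (log(fwd / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return notional * delta * p_end * (fwd * norm_cdf(d1) - strike * norm_cdf(d2))


def black_cap(notional, strike, tenors, discounts, vols):
    """Sum of the caplets n = 2..N (the first caplet is excluded)."""
    total = 0.0
    for n in range(2, len(tenors)):
        total += black_caplet(notional, strike, tenors[n] - tenors[n - 1],
                              discounts[n - 1], discounts[n],
                              tenors[n - 1], vols[n - 1])
    return total
```

With a very small volatility the caplet value collapses to its discounted intrinsic value $A\delta_n P(0,T_n)(L(0,T_{n-1}) - R)^+$, which is a convenient sanity check.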



Example 8.5.3 Caps: Lognormal Bond Prices 

A cap can be decomposed into a portfolio of puts on zero coupon bonds. To be precise, the
$n$-th caplet (from the previous example) has

\[ \text{Payoff} = A\delta_n[L(T_{n-1}) - R]^+ \quad\text{at time } T_n \]

Since $L(T_{n-1})$ is known at time $T_{n-1}$ this is equivalent to a time-$T_{n-1}$ payoff of

\[ \frac{A\delta_n[L(T_{n-1}) - R]^+}{1 + \delta_n L(T_{n-1})} = A\left[1 - (1 + \delta_n R)P(T_{n-1},T_n)\right]^+ = A(1 + \delta_n R)\left[\frac{1}{1 + \delta_n R} - P(T_{n-1},T_n)\right]^+ \]

This last line is easily seen to be the time-$T_{n-1}$ payoff of a portfolio of $A(1 + \delta_n R)$-many put
options with strike $\frac{1}{1+\delta_n R}$ and expiry $T_{n-1}$ on underlying security $P(t,T_n)$. If at time $T_{n-1}$
the caplet has the same payoff as a portfolio of puts on $P(t,T_n)$, then, by the Law of One
Price, the caplet must have the same value as the portfolio of puts at any earlier
time as well.

Thus the $t = 0$ value of the $n$-th caplet is

\[ C_n(0) = A(1 + \delta_n R) \times \text{value of put option on } P(t,T_n) \text{ with strike } \frac{1}{1 + \delta_n R} \text{ and expiry } T_{n-1} \]

This can be evaluated using the method of the first example of this subsection.



□ 
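The payoff identity behind this decomposition is easy to check numerically. The sketch below compares, for a few hypothetical values of the spot rate, the caplet payoff discounted back to $T_{n-1}$ with the payoff of $A(1+\delta_n R)$ bond puts; the function names are illustrative only.

```python
def caplet_payoff_at_reset(notional, delta, libor, strike):
    """Caplet payoff A*delta*(L - R)^+ paid at T_n, discounted to T_{n-1}."""
    return notional * delta * max(libor - strike, 0.0) / (1.0 + delta * libor)


def bond_put_portfolio_payoff(notional, delta, libor, strike):
    """Payoff at T_{n-1} of A(1 + delta*R) puts on P(t,T_n) struck at 1/(1 + delta*R)."""
    bond = 1.0 / (1.0 + delta * libor)         # P(T_{n-1}, T_n)
    put_strike = 1.0 / (1.0 + delta * strike)
    return notional * (1.0 + delta * strike) * max(put_strike - bond, 0.0)
```

Both functions return the same number for every spot rate, in or out of the money, which is exactly the statement that the two positions have the same payoff at $T_{n-1}$.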



Example 8.5.4 Swaptions: Lognormal Swap Rates 

Suppose we initiate, at time $t$, a pay-fixed interest rate swap starting at time $T > t$, with
tenor structure $T = T_0 < T_1 < \dots < T_N$ on a notional amount $A$. This is known as a forward
swap or deferred swap. Let $\delta_n = T_n - T_{n-1}$, and recall that at $T_n$ pay-fixed receives

\[ A\delta_n(L(T_{n-1}) - S_{t,T}) \qquad n = 1,\dots,N \]

where $S_{t,T}$ is the T-forward swap rate at time $t$, and $L(T_{n-1})$ is the spot LIBOR rate at time
$T_{n-1}$ for the period $[T_{n-1},T_n]$. Further recall that $S_{t,T}$ is the rate which sets the initial (i.e.
time $t$) value of the forward swap equal to zero.

The interest payments on a pay-fixed swap are equivalent to the payments of a portfolio
consisting of short a coupon bond with coupon rate $S_{t,T}$, and long a floating rate note. The






bond and the FRN both come into existence at time $T$. The current value of such a forward
starting bond is

\[ A\left[\sum_{n=1}^{N}\delta_n S_{t,T}P(t,T_n) + P(t,T_N)\right] \]

The floating rate note will trade at par at time $T$, i.e. we need to set aside $AP(t,T)$ at time
$t$ to purchase the FRN at time $T$. Hence the forward swap rate satisfies

\[ -A\left[\sum_{n=1}^{N}\delta_n P(t,T_n)S_{t,T} + P(t,T_N)\right] + AP(t,T) = 0 \]

(where the coupon bond and FRN have the same payment dates as the swap, and the same
notional) and thus

\[ S_{t,T} = \frac{P(t,T) - P(t,T_N)}{\sum_{n=1}^{N}\delta_n P(t,T_n)} \]

If $t = T$, then $S_{T,T}$ is just the ordinary spot swap rate at time $T$.

A swaption $C$ is the right to enter into a pay-fixed swap at some future date $T$ at a strike
rate $R$. If the tenor structure is $T = T_0 < T_1 < T_2 < \dots < T_N$, then the swaption gives the
holder the right (but not the obligation) to receive at each of the dates $T_1,\dots,T_N$ an amount

\[ A\delta_n(L(T_{n-1}) - R) \]

If a pay-fixed swap were to be entered at time $T$ at the spot swap rate, then payments would
be

\[ A\delta_n(L(T_{n-1}) - S_{T,T}) \]

and thus the swaption would be exercised only if $R < S_{T,T}$. The swaption thus gives rise to
a series of payments

\[ A\delta_n(S_{T,T} - R)^+ \]

at times $T_n$. Each payment is equivalent to the payoff of $A\delta_n$-many calls with strike $R$
and maturity $T$ on underlying $S_{T,T}$. Using the generalized version of Black's model, i.e.
assuming that $S_{T,T}$ is lognormal under the riskneutral measure and making the appropriate
approximations, the $t = 0$ value of each such payment is

\[ A\delta_n P(0,T_n)\left[S_{0,T}N(d_1) - RN(d_2)\right] \]

where $d_1 = \frac{\ln\frac{S_{0,T}}{R} + \frac12\sigma^2T}{\sigma\sqrt{T}}$, $d_2 = d_1 - \sigma\sqrt{T}$, and $\sigma$ is the volatility of the future spot swap rate
$S_{T,T}$. Hence the value of the swaption is

\[ C_0 = \sum_{n=1}^{N}A\delta_n P(0,T_n)\left[S_{0,T}N(d_1) - RN(d_2)\right] \]

where

\[ d_1 = \frac{\ln\frac{S_{0,T}}{R} + \frac12\sigma^2T}{\sigma\sqrt{T}} \qquad d_2 = d_1 - \sigma\sqrt{T} \]

and

\[ S_{0,T} = \frac{P(0,T) - P(0,T_N)}{\sum_{n=1}^{N}\delta_n P(0,T_n)} \]






We saw that we can make the Black formula for caps exact, provided we work with the
appropriate numeraires, under the appropriate equivalent martingale measures. Can we make
Black's formula for swaptions exact? Yes, indeed. Note that the denominator in the expression
for $S_{0,T}$ is equivalent to a portfolio of zero coupon bonds, i.e.

\[ \sum_{n=1}^{N}\delta_n P(0,T_n) \]

corresponds to a stream of cashflows of size $\delta_n$ at time $T_n$. If, as is often the case, all the $\delta_n$
are of the same size, then this portfolio is just an annuity. Now we may think of the portfolio
as a traded asset, call it $X$, and use it as numeraire.

The first of the Black approximations is exact under the measure $\mathbb{Q}_X$. The time-$T$ value
of all the payoffs of the swaption is

\[ C_T = \sum_{n=1}^{N}A\delta_n P(T,T_n)[S_{T,T} - R]^+ = AX_T[S_{T,T} - R]^+ \]

Hence

\[ \frac{C_0}{X_0} = \mathbb{E}_{\mathbb{Q}_X}\left[\frac{C_T}{X_T}\right] = \mathbb{E}_{\mathbb{Q}_X}\left[A(S_{T,T} - R)^+\right] \]

so that

\[ C_0 = \sum_{n=1}^{N}A\delta_n P(0,T_n)\,\mathbb{E}_{\mathbb{Q}_X}\left[(S_{T,T} - R)^+\right] \]

i.e. we discount outside the expectation.

As for the second approximation, we need to show that the forward swap rate $S_{0,T}$ (which
can now be seen to equal $\frac{P(0,T) - P(0,T_N)}{X_0}$) is the expected value of the future spot swap
rate $S_{T,T}$ under the EMM $\mathbb{Q}_X$, i.e. that $\mathbb{E}_{\mathbb{Q}_X}[S_{T,T}] = S_{0,T}$. To see this, consider a pay-fixed
forward swap $F$ initiated at $t = 0$ to start at time $T$, with interest payment dates
$T_1,\dots,T_N$. The $t = 0$ value of the contract is $F_0 = 0$, whereas at time $T$ the value is
$F_T = \sum_{n=1}^{N}A\delta_n P(T,T_n)[S_{T,T} - S_{0,T}] = AX_T[S_{T,T} - S_{0,T}]$. The desired result now follows
immediately from the fact that

\[ 0 = \frac{F_0}{X_0} = \mathbb{E}_{\mathbb{Q}_X}\left[\frac{F_T}{X_T}\right] = A\left(\mathbb{E}_{\mathbb{Q}_X}[S_{T,T}] - S_{0,T}\right) \]

Hence Black's formula is exact, provided we assume that swap rates are lognormally distributed
under the EMM associated with the annuity process $X_t = \sum_{n=1}^{N}\delta_n P(t,T_n)$.

□ 
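As a rough illustration of the swaption formula, the following Python sketch computes the annuity $X_0$, the forward swap rate $S_{0,T}$ and the Black value $AX_0[S_{0,T}N(d_1) - RN(d_2)]$. The function names and the convention that `deltas` and `discounts` list $\delta_n$ and $P(0,T_n)$ for $n = 1,\dots,N$ are assumptions made for this example.

```python
from math import erf, log, sqrt


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def annuity(deltas, discounts):
    """X_0 = sum_n delta_n * P(0,T_n), the numeraire of the annuity measure."""
    return sum(d * p for d, p in zip(deltas, discounts))


def forward_swap_rate(p_start, deltas, discounts):
    """S_{0,T} = (P(0,T) - P(0,T_N)) / X_0, with p_start = P(0,T)."""
    return (p_start - discounts[-1]) / annuity(deltas, discounts)


def black_swaption(notional, strike, expiry, vol, p_start, deltas, discounts):
    """Black value A * X_0 * [S_{0,T} N(d1) - R N(d2)] of a payer swaption."""
    x0 = annuity(deltas, discounts)
    s0 = forward_swap_rate(p_start, deltas, discounts)
    sd = vol * sqrt(expiry)  # standard deviation of ln S_{T,T}
    d1 = (log(s0 / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return notional * x0 * (s0 * norm_cdf(d1) - strike * norm_cdf(d2))
```

For a single-period swap the forward swap rate reduces to the forward LIBOR rate, so a one-period swaption priced this way agrees with the corresponding caplet.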



It's pretty amazing that the Black formula for various derivatives (published in 1976) can 
in many cases be made exact using the change of numeraire technique (discovered in the early 
1990's). In particular, both the Black formula for caps and that for swaptions are exact if we 
assume that LIBOR rates are lognormal under the appropriate forward riskneutral measures, 
and that swap rates are lognormal under the "annuity" measure. 






8.5.2 Review of Changes of Measure and Numeraire; LIBOR Rates

Fix a horizon $T^* > 0$ and suppose that $(\Omega, \mathcal{F}, \mathbb{P}, (\mathcal{F}_t)_t, (S^i_t)_{i,t})$ is a market model, where the
filtration $(\mathcal{F}_t)_t$ is generated by a standard (multi-dimensional) $\mathbb{P}$-Brownian motion $(W_t)_t$,
augmented to satisfy the usual conditions. Let $\mathbb{Q}$ be the riskneutral measure, i.e. a measure
which has the property that all asset price processes $S^i_t$ are martingales when denominated
in units of the money market account $A_t$. We briefly recall some facts about how Girsanov's
Theorem is used to change the measure (e.g. to construct $\mathbb{Q}$ from $\mathbb{P}$):

• Assume that the asset dynamics are given by

\[ \frac{dS^i_t}{S^i_t} = \mu^i(t,S_t)\,dt + \sigma^i(t,S_t)\,dW_t \qquad dA_t = r_tA_t\,dt \]

with suitable initial conditions.

Recall that the market price of risk $\lambda_t$ is a vector satisfying

\[ \sigma^i_t\cdot\lambda_t = \mu^i_t - r_t \]

(This looks like it depends on the asset $S^i$, but we know from previously developed
theory that, for a model to be arbitrage-free, all assets must have the same market
price of risk. Hence we've suppressed an index $i$.)

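• Let $u(t,\omega)$ be a predictable process, to be used as a kernel for a Girsanov transformation.

• Define a new measure $\tilde{\mathbb{P}}$ by

\[ \frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} = \mathcal{E}_{T^*}\left(\int_0^\cdot u_s\,dW_s\right) \]

• Girsanov's Theorem states that

\[ \tilde W_t = W_t - \int_0^t u_s\,ds \]

is a $\tilde{\mathbb{P}}$-Brownian motion.

• Thus the new asset dynamics are, under $\tilde{\mathbb{P}}$, given by

\[ \frac{dS^i_t}{S^i_t} = (\mu^i_t + \sigma^i_tu_t)\,dt + \sigma^i_t\,d\tilde W_t \qquad \frac{dA_t}{A_t} = r_t\,dt \]

It follows that the market price of risk $\tilde\lambda_t$ under $\tilde{\mathbb{P}}$ must satisfy the relation

\[ \sigma^i_t\tilde\lambda_t = \mu^i_t + \sigma^i_tu_t - r_t = (\lambda_t + u_t)\sigma^i_t \]

and thus

\[ \tilde\lambda_t = \lambda_t + u_t \]

• Hence a Girsanov transformation adds the Girsanov kernel to the market price of risk.
It adds volatility $\times$ kernel to the drift.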






• To obtain a riskneutral measure $\mathbb{Q}$, the new market price of risk must be zero, and
thus we must have $u_t = -\lambda_t$. This is in agreement with what we found earlier. In that
case, the drift becomes $\mu^i_t - \sigma^i_t\lambda_t = r_t$, which we already know very well.

• To change from the riskneutral measure $\mathbb{Q}$ to an equivalent martingale measure $\mathbb{Q}_X$
for numeraire $X$, we proceed as follows: Start in the riskneutral world, where
$\frac{dS_t}{S_t} = r\,dt + \sigma_S\,dW^{\mathbb{Q}}_t$, and $\frac{dX_t}{X_t} = r\,dt + \sigma_X\,dW^{\mathbb{Q}}_t$. Under $\mathbb{Q}_X$, the ratios $\hat S_t = \frac{S_t}{X_t}$ are
martingales. Now under $\mathbb{Q}$, the ratios have dynamics

\[ \frac{d\hat S_t}{\hat S_t} = -\sigma_X(\sigma_S - \sigma_X)\,dt + (\sigma_S - \sigma_X)\,dW^{\mathbb{Q}}_t = -\sigma_X\sigma\,dt + \sigma\,dW^{\mathbb{Q}}_t \]

(where $\sigma = \sigma_S - \sigma_X$). To make the drift equal to zero (i.e. to make $\hat S_t$ into a martingale),
we need to add $\sigma \times \sigma_X$ = volatility $\times\ \sigma_X$, i.e. we need to use a Girsanov transformation
with kernel $\sigma_X$. Thus

\[ \frac{d\mathbb{Q}_X}{d\mathbb{Q}} = \mathcal{E}_{T^*}\left(\int_0^\cdot \sigma_X\,dW^{\mathbb{Q}}\right) \]

• Hence we need to add $\sigma_X$ to the riskneutral market price of risk to obtain the market
price of risk under $\mathbb{Q}_X$. Since the riskneutral market price of risk is zero, the market
price of risk under $\mathbb{Q}_X$ is just the volatility of the numeraire $X$.

• Numeraire-denominated asset price dynamics under the associated equivalent martingale
measure are therefore just

\[ \frac{d\hat S_t}{\hat S_t} = (\sigma_S - \sigma_X)\,dW^X_t \]


If the numeraire is the T-bond $P(t,T)$, the associated EMM is called the T-forward
riskneutral measure, and denoted by $\mathbb{Q}^T$. If bond price dynamics are

\[ \frac{dP(t,S)}{P(t,S)} = \mu_S(t)\,dt + \sigma_S(t)\,dW_t \]

under the "real-world" measure $\mathbb{P}$, then the numeraire denominated dynamics are given
by

\[ \frac{d\hat P(t,S)}{\hat P(t,S)} = (\sigma_S - \sigma_T)\,dW^T_t \]

where $\hat P(t,S) = \frac{P(t,S)}{P(t,T)}$ and $W^T$ is a $\mathbb{Q}^T$-Brownian motion.

• Given future times $T < S$, the market price of risk under $\mathbb{Q}^S$ is just $\sigma_S$, whereas the
market price of risk under $\mathbb{Q}^T$ is $\sigma_T$. To move from $\mathbb{Q}^S$-world to $\mathbb{Q}^T$-world, we must
change the market price of risk from $\sigma_S$ to $\sigma_T$, i.e. we need to add $\sigma_T - \sigma_S$ to the market
price of risk under $\mathbb{Q}^S$. Hence the change from $\mathbb{Q}^S$-world to $\mathbb{Q}^T$-world is effected by a
Girsanov transformation with kernel $\sigma_T - \sigma_S$, i.e.

\[ \frac{d\mathbb{Q}^T}{d\mathbb{Q}^S} = \mathcal{E}_T\left(\int_0^\cdot (\sigma_T - \sigma_S)\,dW^S\right) \]






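We can also verify this directly. Recall that the Radon-Nikodym process $\xi_t = \frac{d\mathbb{Q}^T}{d\mathbb{Q}^S}\Big|_{\mathcal{F}_t}$
for a change of numeraire is given by a ratio of asset ratios:

\[ \xi_t = \frac{P(t,T)/P(t,S)}{P(0,T)/P(0,S)} \]

and thus

\[ \frac{d\xi_t}{\xi_t} = [\sigma_T(t) - \sigma_S(t)]\,dW^S_t \]

The solution of this SDE, together with the initial condition $\xi_0 = 1$, is just
$\xi_t = \mathcal{E}_t\left(\int_0^\cdot (\sigma_T - \sigma_S)\,dW^S\right)$.

• Finally, note that the asset ratio process $\hat P(t,T) = \frac{P(t,T)}{P(t,S)}$ satisfies the same SDE as does $\xi_t$:

\[ \frac{d\hat P(t,T)}{\hat P(t,T)} = [\sigma_T(t) - \sigma_S(t)]\,dW^S_t \]

although their initial conditions may differ. Hence $\xi_t$ and $\hat P(t,T)$ differ by a constant
factor, i.e.

\[ \xi_t = c\hat P(t,T) = c\,\frac{P(t,T)}{P(t,S)} \]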
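The drift of the numeraire-denominated price quoted in the list above can be checked with Itô's formula for a quotient. The following is only a sketch, written for a scalar Brownian motion and with $\hat S_t = S_t/X_t$; it uses nothing beyond the riskneutral dynamics of $S$ and $X$ stated there.

```latex
\begin{align*}
\frac{d\hat S_t}{\hat S_t}
  &= \frac{dS_t}{S_t} - \frac{dX_t}{X_t}
     - \frac{d\langle S, X\rangle_t}{S_t X_t}
     + \frac{d\langle X\rangle_t}{X_t^2} \\
  &= \left(r - r - \sigma_S\sigma_X + \sigma_X^2\right)dt
     + (\sigma_S - \sigma_X)\,dW^{\mathbb{Q}}_t \\
  &= -\sigma_X(\sigma_S - \sigma_X)\,dt + (\sigma_S - \sigma_X)\,dW^{\mathbb{Q}}_t .
\end{align*}
```

Substituting $dW^{\mathbb{Q}}_t = dW^X_t + \sigma_X\,dt$, which is what a Girsanov transformation with kernel $\sigma_X$ does, cancels the drift term exactly.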



Let $T^* > 0$ be a horizon for our bond market model. The time-$t$ forward LIBOR rate
for the future interval $[T, T+\delta]$ (where $T \le T^* - \delta$) is defined by

\[ 1 + \delta L(t,T) = \frac{P(t,T)}{P(t,T+\delta)} \qquad\text{i.e.}\qquad L(t,T) = \frac{P(t,T) - P(t,T+\delta)}{\delta P(t,T+\delta)} \]

We saw earlier that $L(t,T)$ is the interest rate for the period $[T, T+\delta]$ that can be locked in
at time $t$ (by a judicious investment in a portfolio of $T$- and $(T+\delta)$-bonds with zero initial
cost).

Alternatively, the forward LIBOR rate $L(t,T)$ can be regarded as the swap rate for a
single-period swap settled in arrears. For suppose that we have a single-period interest rate
swap, contracted at time $t$, for the period $[T, T+\delta]$, to be settled at time $T+\delta$. Thus, at
time $T+\delta$, the pay-fixed side pays $\delta R$, and the receive-fixed party pays $P^{-1}(T,T+\delta) - 1$,
where $R$ is the fair swap rate, and $P^{-1}(T,T+\delta) = 1 + \delta S$, $S = L(T,T)$ the spot rate at time
$T$ for period $[T, T+\delta]$. Equivalently, by adding 1 to both payments, pay-fixed pays $Y^{fx}$ and
receive-fixed pays $Y^{fl}$, where

\[ Y^{fx} = 1 + \delta R \qquad Y^{fl} = P^{-1}(T,T+\delta) \]

We can regard $Y^{fx}$ and $Y^{fl}$ as contingent claims which are paid out at time $T+\delta$. It is clear
that the time-$t$ value of $Y^{fx}$ is just

\[ Y^{fx}_t = P(t,T+\delta)[1 + \delta R] \]

The time-$t$ value of $Y^{fl}$ is obtained as follows: If, at time $T$, we invest \$1.00 in $(T+\delta)$-bonds,
the payoff at time $T+\delta$ will be $P^{-1}(T,T+\delta)$. To obtain the required \$1.00 at time $T$, we must invest
in one T-bond at time $t < T$. Hence

\[ Y^{fl}_t = P(t,T) \]






The swap rate at time $t$ is the rate $R$ for which $Y^{fx}_t = Y^{fl}_t$, and thus
$R = \frac{P(t,T) - P(t,T+\delta)}{\delta P(t,T+\delta)} = L(t,T)$.
Define

\[ P(t,T,S) = \frac{P(t,T)}{P(t,S)} = 1 + \delta L(t,T) \qquad\text{for } t \le T < S \text{ and } \delta = S - T \]

Then $P(t,T,S)$ is a $\mathbb{Q}^S$-martingale. In particular, the LIBOR forward rate $L(t,T)$ is a
$\mathbb{Q}^{T+\delta}$-martingale. Thus the LIBOR forward rate $L(t,T)$ is simply the expected value of the LIBOR
spot rate $L(T,T)$ at time $T$, where the expectation is taken under the $\mathbb{Q}^{T+\delta}$-measure.



8.6 Lognormal Forward LIBOR Market Models 

We start with a pre-specified sequence of times

\[ 0 = T_0 < T_1 < T_2 < \dots < T_N = T^* \]

These times, typically settlement- or reset dates, are collectively known as the tenor structure.
We also define $\delta_j = T_j - T_{j-1}$ for $j = 1,\dots,N$. Then the forward LIBOR rates satisfy

\[ 1 + \delta_{j+1}L(t,T_j) = \frac{P(t,T_j)}{P(t,T_{j+1})} = P(t,T_j,T_{j+1}) \]

We assume that the bond market satisfies a strong form of the no-arbitrage condition, i.e.
we assume that there exists a riskneutral measure $\mathbb{Q}$ simultaneously for all discount bonds
$P(t,T)$. We denote, for each $P(t,T)$, its associated forward riskneutral measure by $\mathbb{Q}^T$. $W_t$
and $W^T_t$ will denote, respectively, $\mathbb{Q}$- and $\mathbb{Q}^T$-Brownian motions.

Let $S(t,T)$ be the volatility of the T-bond $P(t,T)$ at time $t$. From the previous subsection,
we know the following:

• $\mathbb{Q}^{T_j}$ is obtained from $\mathbb{Q}^{T_{j+1}}$ via a Girsanov transformation with kernel $S(t,T_j) - S(t,T_{j+1})$,
i.e.

\[ \frac{d\mathbb{Q}^{T_j}}{d\mathbb{Q}^{T_{j+1}}} = \mathcal{E}_{T_j}\left(\int_0^\cdot \left[S(t,T_j) - S(t,T_{j+1})\right]dW^{T_{j+1}}_t\right) \]

• Each asset ratio $P(t,T_j,T_{j+1}) = \frac{P(t,T_j)}{P(t,T_{j+1})}$ is a $\mathbb{Q}^{T_{j+1}}$-martingale.

• Each forward LIBOR rate $L(t,T_j)$ is a $\mathbb{Q}^{T_{j+1}}$-martingale.

• The $\mathbb{Q}^{T_{j+1}}$-dynamics of the asset ratio $P(t,T_j,T_{j+1})$ are

\[ \frac{dP(t,T_j,T_{j+1})}{P(t,T_j,T_{j+1})} = \left(S(t,T_j) - S(t,T_{j+1})\right)dW^{T_{j+1}}_t \]

• There is a constant $c$ such that the Radon-Nikodym process $\xi_t$ and the asset ratio process
are related by

\[ \xi_t = cP(t,T_j,T_{j+1}) = c\left(1 + \delta_{j+1}L(t,T_j)\right) \]






Note that, assuming that the forward LIBOR rate processes $L(t,T)$ are strictly positive, we
have the following dynamics:

\[ dL(t,T_j) = L(t,T_j)\lambda(t,T_j)\,dW^{T_{j+1}}_t \]

This follows from the Martingale Representation Theorem: $L(t,T_j)$ is a $\mathbb{Q}^{T_{j+1}}$-martingale,
and thus we must have $dL(t,T_j) = h_t\,dW^{T_{j+1}}_t$. Since $L(t,T_j)$ is strictly positive, we may
define $\lambda(t,T_j) = \frac{h_t}{L(t,T_j)}$ to obtain $dL(t,T_j) = L(t,T_j)\lambda(t,T_j)\,dW^{T_{j+1}}_t$.

Now $P(t,T_j,T_{j+1}) = 1 + \delta_{j+1}L(t,T_j)$, so that
$dP(t,T_j,T_{j+1}) = \delta_{j+1}\,dL(t,T_j) = \delta_{j+1}L(t,T_j)\lambda(t,T_j)\,dW^{T_{j+1}}_t$.
We also found that $\frac{dP(t,T_j,T_{j+1})}{P(t,T_j,T_{j+1})} = (S(t,T_j) - S(t,T_{j+1}))\,dW^{T_{j+1}}_t$, and equating these expressions
yields

\[ S(t,T_j) - S(t,T_{j+1}) = \frac{\delta_{j+1}L(t,T_j)}{1 + \delta_{j+1}L(t,T_j)}\lambda(t,T_j) \]

This expression will play an important role in the inductive construction of lognormal models
of forward LIBOR rates.

Since the move from $\mathbb{Q}^{T_{j+1}}$-world to $\mathbb{Q}^{T_j}$-world is accomplished by a Girsanov transformation
with kernel $S(t,T_j) - S(t,T_{j+1}) = \frac{\delta_{j+1}L(t,T_j)\lambda(t,T_j)}{1 + \delta_{j+1}L(t,T_j)}$, the dynamics of $L(t,T_j)$ under $\mathbb{Q}^{T_j}$
are given by

\[ dL(t,T_j) = L(t,T_j)\left[\frac{\delta_{j+1}L(t,T_j)\lambda(t,T_j)^2}{1 + \delta_{j+1}L(t,T_j)}\,dt + \lambda(t,T_j)\,dW^{T_j}_t\right] \]

because volatility $\times$ kernel must be added to the $\mathbb{Q}^{T_{j+1}}$-drift of $L(t,T_j)$, while leaving the
volatility unchanged (and the drift is zero, while the volatility is $L(t,T_j)\lambda(t,T_j)$).

8.6.1 The Brace-Gatarek-Musiela Approach to Forward LIBOR 

In most markets, caps and floors form the largest component of an av- 
erage swap derivatives book. . . . Market practice is to price the option 
assuming that the underlying forward rate process is lognormally dis- 
tributed with zero drift. Consequently, the option price is given by the 
Black futures formula, discounted from the settlement date.
In an arbitrage-free setting, forward rates over consecutive intervals 
are all related to one another, and cannot all be lognormal under one 
arbitrage-free measure. That is probably what led the academic com- 
munity to a degree of skepticism toward the market practice of pricing 
caps. . . 

The aim of this paper is to show that market practice can be made con- 
sistent with an arbitrage-free term structure model. . . This is possible 
because each rate is lognormal under the forward (to the settlement 
date) arbitrage-free measure rather than under one (spot) arbitrage- 
free measure. Lognormality under the appropriate forward and not spot 
arbitrage-free measure is needed to justify the Black futures formula with 
discount for caplet pricing. 

— Brace, Gatarek, Musiela [1997]

The BGM-model starts from a family $P(t,T)$ of discount bond prices up to some horizon
maturity $T^*$. We assume that each forward rate is over a period of length $\delta$ (the same for all






rates). The bond price processes also give us the bond ratio processes (i.e. forward prices)
$P(t,T,S) = \frac{P(t,T)}{P(t,S)}$. The forward LIBOR rates $L(t,T)$ are thus defined by

\[ 1 + \delta L(t,T) = P(t,T,T+\delta) \qquad\text{for } T \le T^* - \delta \]

BGM put their model inside the HJM framework, i.e. they assume that a term structure
of instantaneous forward rates for all maturities (less than the horizon date $T^*$) is available.
In contrast, the Musiela-Rutkowski and Jamshidian approaches require forward rates only
for a discrete set of tenor dates, as we shall see. Now recall that if, in an HJM model, the
riskneutral dynamics of the instantaneous forward rate $f(t,T)$ is given by

\[ df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,dW_t \qquad\text{where } \alpha(t,T) = \sigma(t,T)\int_t^T\sigma(t,u)\,du \]

(using the HJM drift condition), then the riskneutral bond price dynamics are given by

\[ \frac{dP(t,T)}{P(t,T)} = r_t\,dt + S(t,T)\,dW_t \qquad\text{where } S(t,T) = -\int_t^T\sigma(t,u)\,du \]

Further recall that earlier we obtained

\[ \frac{\delta L(t,T)}{1 + \delta L(t,T)}\lambda(t,T) = S(t,T) - S(t,T+\delta) \]

(which also follows if we apply Ito's formula to the identity $1 + \delta L(t,T) = e^{\int_T^{T+\delta}f(t,u)\,du}$ and
compare the $dW_t$-terms). The main problem is this:

    How can we specify bond volatilities $S(t,T)$ (or equivalently, the
    instantaneous forward rate volatilities $\sigma(t,T) = -\frac{\partial S(t,T)}{\partial T}$) so
    that the resulting discrete simple forward LIBOR rates will have the
    desired deterministic volatility structure?

We have already seen that $L(t,T)$ is a non-negative $\mathbb{Q}^{T+\delta}$-martingale. For $1 + \delta L(t,T) =
P(t,T,T+\delta)$, and so $dL(t,T) = \delta^{-1}dP(t,T,T+\delta)$. But $P(t,T,T+\delta)$ is a $\mathbb{Q}^{T+\delta}$-martingale
(by definition of $\mathbb{Q}^{T+\delta}$), with dynamics $\frac{dP(t,T,T+\delta)}{P(t,T,T+\delta)} = [S(t,T) - S(t,T+\delta)]\,dW^{T+\delta}_t$. It therefore
follows that

\[ dL(t,T) = \delta^{-1}P(t,T,T+\delta)\left[S(t,T) - S(t,T+\delta)\right]dW^{T+\delta}_t \]

i.e.

\[ dL(t,T) = L(t,T)\lambda(t,T)\,dW^{T+\delta}_t \qquad\text{where } \lambda(t,T) = \frac{1 + \delta L(t,T)}{\delta L(t,T)}\left[S(t,T) - S(t,T+\delta)\right] \]

We are therefore able to derive the forward LIBOR dynamics directly from the bond price 
volatilities (or, equivalently, the instantaneous forward rate volatilities). Since the forward 






riskneutral measure $\mathbb{Q}^{T+\delta}$ is obtained from the (spot) riskneutral measure $\mathbb{Q}$ by a Girsanov
transformation with kernel $S(t,T+\delta)$, we have

\[ dW^{T+\delta}_t = dW_t - S(t,T+\delta)\,dt \]

for a $\mathbb{Q}$-Brownian motion $W_t$. Thus the riskneutral drift is directly determined by the volatility
structure (as it is in the HJM model), giving riskneutral forward LIBOR rate dynamics

\[ \frac{dL(t,T)}{L(t,T)} = -\lambda(t,T)\cdot S(t,T+\delta)\,dt + \lambda(t,T)\,dW_t \]

Now suppose that we want to create an HJM model in which forward LIBOR rates $L(t,T)$
have a deterministic volatility structure $\lambda(t,T)$. Above, we found that

\[ S(t,T) - S(t,T+\delta) = \int_T^{T+\delta}\sigma(t,u)\,du = \frac{\delta L(t,T)}{1 + \delta L(t,T)}\lambda(t,T) \]

(where $S$ and $\sigma$ are the bond and instantaneous forward rate volatilities respectively). In
order to find the bond volatilities, it is necessary to impose some additional conditions. Set

\[ \sigma(t,u) = 0 \qquad\text{when } 0 \le u - t < \delta \]

(This is the fundamental assumption made in BGM(1997)).
Now find the bond volatilities by a recursive procedure:

• Choose $n$ such that $n\delta \le T - t < (n+1)\delta$. Equivalently $n = \sup\{k \in \mathbb{N} : k\delta \le T - t\} = [\delta^{-1}(T - t)]$
(where $[x]$ is the integer part of $x$).

• Then $S(t,T-n\delta) = -\int_t^{T-n\delta}\sigma(t,u)\,du = 0$, because $0 \le u - t < \delta$ when $t \le u \le T - n\delta$.

• Thus

\[ S(t,T) = [S(t,T) - S(t,T-\delta)] + [S(t,T-\delta) - S(t,T-2\delta)] + \dots + [S(t,T-(n-1)\delta) - S(t,T-n\delta)] \]

implies

\[ S(t,T) = -\frac{\delta L(t,T-\delta)}{1 + \delta L(t,T-\delta)}\lambda(t,T-\delta) - \frac{\delta L(t,T-2\delta)}{1 + \delta L(t,T-2\delta)}\lambda(t,T-2\delta) - \dots - \frac{\delta L(t,T-n\delta)}{1 + \delta L(t,T-n\delta)}\lambda(t,T-n\delta) \]

• i.e.

\[ S(t,T) = -\sum_{k=1}^{[\delta^{-1}(T-t)]}\frac{\delta L(t,T-k\delta)}{1 + \delta L(t,T-k\delta)}\lambda(t,T-k\delta) \]

Equivalently,
(i) Define $S(t,T) = 0$ for $0 \le T - t < \delta$.






(ii) Then define $S(t,T) = S(t,T-\delta) - \frac{\delta L(t,T-\delta)}{1 + \delta L(t,T-\delta)}\lambda(t,T-\delta)$ for $\delta \le T - t < 2\delta$.

(Note that if $\delta \le T - t < 2\delta$, then $0 \le (T-\delta) - t < \delta$, so $S(t,T-\delta)$ has already been
defined.)

(iii) Then define $S(t,T) = S(t,T-\delta) - \frac{\delta L(t,T-\delta)}{1 + \delta L(t,T-\delta)}\lambda(t,T-\delta)$ for $2\delta \le T - t < 3\delta$.

(Note that if $2\delta \le T - t < 3\delta$, then $\delta \le (T-\delta) - t < 2\delta$, so $S(t,T-\delta)$ has already been
defined.)

(iv) . . . etc.

In this way, if we specify bond volatilities by this forward induction, then we will have an HJM
model in which the forward LIBOR rates $L(t,T)$ have the required deterministic volatilities
$\lambda(t,T)$. Since each $L(t,T)$ is a strictly positive $\mathbb{Q}^{T+\delta}$-martingale with deterministic volatility, it follows that each $L(t,T)$
is lognormal under $\mathbb{Q}^{T+\delta}$, and thus that the Black formula for caps is valid in this model.
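The recursive definition of the bond volatilities can be written as a short loop. The sketch below assumes a scalar Brownian motion and takes the current forward LIBOR curve and the volatility structure as user-supplied callables `libor(t, T)` and `lam(t, T)`; these names, and the small tolerance used to guard the integer part against rounding, are choices made for this illustration.

```python
from math import floor


def bond_volatility(t, T, delta, libor, lam):
    """S(t,T) = -sum_{k=1}^{n} delta*L(t,T-k*delta)/(1+delta*L(t,T-k*delta)) * lam(t,T-k*delta),

    with n = integer part of (T - t)/delta, so that S(t,T) = 0 when T - t < delta.
    """
    n = int(floor((T - t) / delta + 1e-12))  # tolerance guards against rounding
    total = 0.0
    for k in range(1, n + 1):
        maturity = T - k * delta
        rate = libor(t, maturity)
        total -= delta * rate / (1.0 + delta * rate) * lam(t, maturity)
    return total
```

By construction the result satisfies $S(t,T) - S(t,T+\delta) = \frac{\delta L(t,T)}{1+\delta L(t,T)}\lambda(t,T)$, which is the relation the induction is designed to enforce.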

8.6.2 The Musiela-Rutkowski Approach to Forward LIBOR 

Unlike the BGM-approach, which lies within the HJM framework and specifies a model
of forward LIBOR rates $L(t,T)$ for all maturities $T$ (below the horizon $T^*$), the Musiela-Rutkowski
(MR) approach only specifies LIBOR rates for a discrete set of maturities. We
start with a discrete tenor structure

\[ 0 < T_0 < T_1 < \dots < T_N = T^* \qquad \delta_n = T_n - T_{n-1} \]

and define $T_{-1} = 0$ (for ease of handling certain formulas). We further assume that we are
given

• A family of bounded adapted processes $\lambda(t,T_n)$ for $n = 0,\dots,N-1$ which represent
the volatilities of the forward LIBOR rates $L(t,T_n)$.

• An initial term structure $P(0,T_n)$ of discount bond prices (used to specify the initial
conditions of the SDE's which we will write down for the LIBOR rates). We further
assume that $P(0,T_0) > P(0,T_1) > \dots > P(0,T_N)$.

In contrast to the BGM approach, we do not need bond price dynamics at all, i.e. we will
attempt to model LIBOR rates directly.

Before we construct the MR model of LIBOR rates, we state a lemma which will prove useful.

Lemma 8.6.1 If $X, Y$ are adapted processes with

\[ dX_t = \alpha_t\,dW_t \qquad dY_t = \beta_t\,dW_t \]

and if $Z_t = \frac{1}{1 + Y_t}$, then

\[ d(Z_tX_t) = Z_t(\alpha_t - \beta_tZ_tX_t)\cdot(dW_t - \beta_tZ_t\,dt) \]
\[ \text{i.e.}\quad d(Z_tX_t) = \eta_t\cdot(dW_t - \beta_tZ_t\,dt) \]

for some process $\eta_t$.

Proof: A straightforward application of Ito's formula.






□ 
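For reference, the Itô computation behind the lemma can be spelled out as follows. This is only a sketch for a scalar Brownian motion $W$, and it assumes $Z_t = 1/(1+Y_t)$, which is the form in which the lemma is applied below.

```latex
\begin{align*}
dZ_t &= -Z_t^2\,dY_t + Z_t^3\,d\langle Y\rangle_t
      = -\beta_t Z_t^2\,dW_t + \beta_t^2 Z_t^3\,dt, \\
d(Z_tX_t) &= Z_t\,dX_t + X_t\,dZ_t + d\langle Z, X\rangle_t \\
  &= Z_t(\alpha_t - \beta_t Z_t X_t)\,dW_t
     - \beta_t Z_t^2(\alpha_t - \beta_t Z_t X_t)\,dt \\
  &= Z_t(\alpha_t - \beta_t Z_t X_t)\,(dW_t - \beta_t Z_t\,dt).
\end{align*}
```

The point of the factorised form is that the same shift $\beta_tZ_t\,dt$ of the Brownian motion removes the drift of $Z_tX_t$ for every martingale $X$ driven by $W$, whatever its volatility $\alpha_t$.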

Whereas the BGM approach shows how to define bond volatilities by forward induction,
the MR approach directly constructs a set of measures under which forward LIBOR rates
have the required volatility structure by backward induction. It is therefore convenient to
introduce the following backward notation. Put

\[ T^*_k = T_{N-k} \qquad\text{so that}\qquad T^* = T^*_0 > T^*_1 > \dots > T^*_N = T_0 \]

We start by working under a $T_N$-forward riskneutral measure $\mathbb{Q}^{T_N} = \mathbb{Q}^{T^*_0}$, together with
a $\mathbb{Q}^{T_N}$-Brownian motion $W^{T_N} = W^{T^*_0}$. It is not necessary to construct this measure: we
can assume that $\mathbb{Q}^{T_N}$ is the measure $\mathbb{P}$ which governs our model, and that $W^{T_N}$ is the $W_t$
which drives the economy. Ultimately, we will be able to specify all the dynamics under this
measure, the terminal measure. Let $L(t,T^*_1) = L(t,T_{N-1})$ be a process which satisfies the
SDE plus initial value

\[ dL(t,T^*_1) = L(t,T^*_1)\lambda(t,T^*_1)\,dW^{T^*_0}_t \qquad L(0,T^*_1) = \frac{P(0,T^*_1) - P(0,T^*_0)}{\delta_NP(0,T^*_0)} \]

This defines the forward LIBOR rate $L(t,T^*_1) = L(t,T_{N-1})$ in the MR model.

We now use this to define the forward LIBOR rate $L(t,T^*_2) = L(t,T_{N-2})$. To do so, we need
to construct the forward riskneutral measure for maturity $T^*_1$. Under $\mathbb{Q}^{T^*_1}$, all the bond ratios
$\frac{P(t,T^*_k)}{P(t,T^*_1)}$ are martingales. Now define the ratio

\[ U_{N-n+1}(t,T_k) = \frac{P(t,T_k)}{P(t,T_n)} \qquad\text{or, equivalently}\qquad U_n(t,T^*_k) = \frac{P(t,T^*_k)}{P(t,T^*_{n-1})} \]

and note that each $U_n(t,T^*_k)$ is required to be a martingale under the measure $\mathbb{Q}^{T^*_{n-1}}$ (which
we must still construct). Further note that

\[ U_2(t,T^*_k) = \frac{U_1(t,T^*_k)}{1 + \delta_NL(t,T^*_1)} \]

so that by the lemma,

\[ dU_2(t,T^*_k) = \eta_{k,t}\cdot\left(dW^{T^*_0}_t - \frac{\delta_NL(t,T^*_1)}{1 + \delta_NL(t,T^*_1)}\lambda(t,T^*_1)\,dt\right) \]

for some process $\eta_{k,t}$ (whose exact nature is not important right now). In order for each
$U_2(t,T^*_k)$ to be a martingale, it suffices to find a measure under which

\[ W^{T^*_1}_t = W^{T^*_0}_t - \int_0^t\frac{\delta_NL(s,T^*_1)}{1 + \delta_NL(s,T^*_1)}\lambda(s,T^*_1)\,ds \]

is a Brownian motion. This is possible if we perform a Girsanov transformation from $\mathbb{Q}^{T_N} =
\mathbb{Q}^{T^*_0}$ with kernel $\gamma(s,T^*_1) = \frac{\delta_NL(s,T^*_1)}{1 + \delta_NL(s,T^*_1)}\lambda(s,T^*_1)$, i.e. if we define

\[ \frac{d\mathbb{Q}^{T^*_1}}{d\mathbb{Q}^{T^*_0}} = \mathcal{E}_{T^*_1}\left(\int_0^\cdot\gamma(t,T^*_1)\,dW^{T^*_0}_t\right) \]





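We now let $L(t,T^*_2)$ be a process which solves the SDE and initial condition

\[ dL(t,T^*_2) = L(t,T^*_2)\lambda(t,T^*_2)\,dW^{T^*_1}_t \qquad L(0,T^*_2) = \frac{P(0,T^*_2) - P(0,T^*_1)}{\delta_{N-1}P(0,T^*_1)} \]

We continue in this way: Suppose that we have already constructed the LIBOR rate processes
$L(t,T^*_1),\dots,L(t,T^*_n)$, for $n \le N - 1$. Suppose further that this has been done so that each
forward measure and Brownian motion has been specified, in particular that we have already
constructed $\mathbb{Q}^{T^*_{n-1}}$ and $W^{T^*_{n-1}}$, and that $dL(t,T^*_n) = L(t,T^*_n)\lambda(t,T^*_n)\,dW^{T^*_{n-1}}_t$ under $\mathbb{Q}^{T^*_{n-1}}$.
We must now construct a measure $\mathbb{Q}^{T^*_n}$ and an associated Brownian motion $W^{T^*_n}$. We require
that each $U_{n+1}(t,T^*_k)$ is a $\mathbb{Q}^{T^*_n}$-martingale. Now

\[ U_{n+1}(t,T^*_k) = \frac{U_n(t,T^*_k)}{1 + \delta_{N-n+1}L(t,T^*_n)} \]

Using the lemma, we see that

\[ dU_{n+1}(t,T^*_k) = \eta_{k,t}\cdot\left(dW^{T^*_{n-1}}_t - \frac{\delta_{N-n+1}L(t,T^*_n)}{1 + \delta_{N-n+1}L(t,T^*_n)}\lambda(t,T^*_n)\,dt\right) \]

for some process $\eta_{k,t}$ (whose exact nature is not important right now). In order for each
$U_{n+1}(t,T^*_k)$ to be a martingale, it suffices to find a measure under which

\[ W^{T^*_n}_t = W^{T^*_{n-1}}_t - \int_0^t\frac{\delta_{N-n+1}L(s,T^*_n)}{1 + \delta_{N-n+1}L(s,T^*_n)}\lambda(s,T^*_n)\,ds \]

is a Brownian motion. This is possible if we perform a Girsanov transformation from $\mathbb{Q}^{T^*_{n-1}}$
with kernel $\gamma(s,T^*_n) = \frac{\delta_{N-n+1}L(s,T^*_n)}{1 + \delta_{N-n+1}L(s,T^*_n)}\lambda(s,T^*_n)$, i.e. if we define

\[ \frac{d\mathbb{Q}^{T^*_n}}{d\mathbb{Q}^{T^*_{n-1}}} = \mathcal{E}_{T^*_n}\left(\int_0^\cdot\gamma(t,T^*_n)\,dW^{T^*_{n-1}}_t\right) \]

We now let $L(t,T^*_{n+1})$ be a process which solves the SDE and initial condition

\[ dL(t,T^*_{n+1}) = L(t,T^*_{n+1})\lambda(t,T^*_{n+1})\,dW^{T^*_n}_t \qquad L(0,T^*_{n+1}) = \frac{P(0,T^*_{n+1}) - P(0,T^*_n)}{\delta_{N-n}P(0,T^*_n)} \]

We have now constructed a sequence of processes $L(t,T_n)$ which are models of the forward
LIBOR rates, with the desired volatilities. Since we also know the Girsanov kernels of each
transformation, we can specify all LIBOR rate dynamics under the terminal measure. Inductively,

\begin{align*}
dL(t,T^*_n) &= L(t,T^*_n)\lambda(t,T^*_n)\,dW^{T^*_{n-1}}_t \\
 &= -L(t,T^*_n)\lambda(t,T^*_n)\gamma(t,T^*_{n-1})\,dt + L(t,T^*_n)\lambda(t,T^*_n)\,dW^{T^*_{n-2}}_t \\
 &= -L(t,T^*_n)\lambda(t,T^*_n)\left[\gamma(t,T^*_{n-1}) + \gamma(t,T^*_{n-2})\right]dt + L(t,T^*_n)\lambda(t,T^*_n)\,dW^{T^*_{n-3}}_t \\
 &= \dots \\
 &= -L(t,T^*_n)\lambda(t,T^*_n)\sum_{k=1}^{n-1}\gamma(t,T^*_k)\,dt + L(t,T^*_n)\lambda(t,T^*_n)\,dW^{T^*_0}_t
\end{align*}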






where

\[ \gamma(t,T^*_k) = \frac{\delta_{N-k+1}L(t,T^*_k)}{1 + \delta_{N-k+1}L(t,T^*_k)}\lambda(t,T^*_k) \]

and hence, when we translate from backwards time to ordinary time,

    The Musiela-Rutkowski forward LIBOR rate dynamics under the terminal
    measure $\mathbb{Q}^{T_N}$ are given by

    \[ dL(t,T_n) = -L(t,T_n)\lambda(t,T_n)\sum_{k=n+1}^{N-1}\frac{\delta_{k+1}L(t,T_k)}{1 + \delta_{k+1}L(t,T_k)}\lambda(t,T_k)\,dt + L(t,T_n)\lambda(t,T_n)\,dW_t \]

    This must be solved recursively: First find the solution for $L(t,T_{N-1})$.
    Once this has been found, find the solution for $L(t,T_{N-2})$. Note that the
    SDE for $L(t,T_{N-2})$ also contains $L(t,T_{N-1})$, but we've already found
    that. Then solve the SDE for $L(t,T_{N-3})$ (which contains $L(t,T_{N-1})$ and
    $L(t,T_{N-2})$; these have been determined). And so on. . .

It is therefore possible to find a model in which LIBOR rates have the required volatilities
$\lambda(t,T_n)$. If these volatilities are deterministic, then each $L(t,T_n)$ will be lognormal under
$\mathbb{Q}^{T_{n+1}}$. In that case, the Black formula for caps will be exact.
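The drift terms in the terminal-measure dynamics can be accumulated in one backward pass over the tenor dates, mirroring the recursive order in which the SDEs are solved. The sketch below assumes a scalar Brownian motion; the function name and the list layout (`libors[n]` for $L(t,T_n)$, `lams[n]` for $\lambda(t,T_n)$ with $n = 0,\dots,N-1$, and `deltas[k]` for $\delta_k$ with `deltas[0]` unused) are illustrative conventions, not the notes' own.

```python
def terminal_measure_drifts(libors, lams, deltas):
    """Drift of dL(t,T_n) under the terminal measure Q^{T_N}, for n = 0..N-1.

    drift_n = -L_n * lam_n * sum_{k=n+1}^{N-1} delta_{k+1} L_k lam_k / (1 + delta_{k+1} L_k)
    """
    n_rates = len(libors)
    drifts = [0.0] * n_rates
    tail = 0.0  # running sum of the Girsanov kernels for k > n
    for n in range(n_rates - 1, -1, -1):
        drifts[n] = -libors[n] * lams[n] * tail
        tail += deltas[n + 1] * libors[n] * lams[n] / (1.0 + deltas[n + 1] * libors[n])
    return drifts
```

The last rate $L(t,T_{N-1})$ has zero drift, as it must, since it is a martingale under $\mathbb{Q}^{T_N}$. With positive rates and volatilities, all earlier rates pick up a negative drift.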



8.6.3 Jamshidian's Approach to Forward LIBOR

Like the Musiela-Rutkowski approach, Jamshidian(1997) does not require bond price dynamics,
and models LIBOR rates for a discrete set of tenor dates $0 = T_{-1} < T_0 < T_1 < \dots <
T_N = T^*$ via a backward induction. But instead of working under the terminal measure,
Jamshidian defines a spot LIBOR measure. This measure is obtained if we take as numeraire
a certain portfolio of zero coupon bonds with unit initial value.

We begin by observing that the prices of discount bonds are not completely determined
by the forward LIBOR rates. They are determined at tenor dates, but if $t$ lies between tenor dates,
e.g. $T_n < t < T_{n+1}$, then $P(t,T_{n+k}) = P(t,T_{n+1})\cdot\prod_{j=2}^{k}\frac{1}{1 + \delta_{n+j}L(t,T_{n+j-1})}$. Thus
knowledge of the LIBOR rates is not enough: we also have to know the discount factor
to the next tenor date (i.e. $P(t,T_{n+1})$). By working under the spot LIBOR measure, this
problem can be circumvented.

Consider the following portfolio of discount bonds $X$. Its initial value is \$1.00. At all
subsequent times, all wealth is invested in the next-to-mature bond. Thus at $t = 0$, \$1.00 is
invested in $P(t,T_0)$. At $T_0$, the payoff of these bonds is reinvested in $P(t,T_1)$ and at $T_1$, the
payoff is reinvested in $P(t,T_2)$, etc. Thus at time $T_n$, the value of the portfolio is

\[ X_{T_n} = P(T_n,T_{n+1})\cdot\frac{1}{P(0,T_0)\cdot P(T_0,T_1)\cdots P(T_n,T_{n+1})} \]
\[ = \text{value of $T_{n+1}$-bonds} \times \text{no. of $T_{n+1}$-bonds} \]

An instant later, when $T_n < t < T_{n+1}$, the value $X_t$ of the portfolio is simply

\[ X_t = \frac{P(t,T_{n+1})}{P(0,T_0)\cdot P(T_0,T_1)\cdots P(T_n,T_{n+1})} \]




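because the value of the $T_{n+1}$-bond has changed, but the number of $T_{n+1}$-bonds in the
portfolio has not. Hence

\[ X_t = P(t,T_{n(t)})\cdot\prod_{k=0}^{n(t)}P^{-1}(T_{k-1},T_k) \qquad (*) \]

where $n(t) = \inf\{n : T_n \ge t\}$.

A spot LIBOR measure $\mathbb{Q}_X$ is obtained by taking $X_t$ as numeraire, so that each asset ratio
process $\frac{P(t,T_{n+1})}{X_t}$ is a $\mathbb{Q}_X$-martingale. The asset ratios can be written as

\begin{align*}
\frac{P(t,T_{n+1})}{X_t} &= \frac{P(t,T_{n(t)})\prod_{k=n(t)+1}^{n+1}(1 + \delta_kL(t,T_{k-1}))^{-1}}{P(t,T_{n(t)})\prod_{k=0}^{n(t)}(1 + \delta_kL(T_{k-1},T_{k-1}))} \\
 &= \prod_{k=0}^{n(t)}(1 + \delta_kL(T_{k-1},T_{k-1}))^{-1}\prod_{k=n(t)+1}^{n+1}(1 + \delta_kL(t,T_{k-1}))^{-1} \\
 &= \prod_{k=0}^{n+1}(1 + \delta_kL(t\wedge T_{k-1},T_{k-1}))^{-1}
\end{align*}

Hence the prices of the asset ratios are completely determined by the LIBOR processes.
We now aim to describe the LIBOR rate dynamics under the spot LIBOR measure $\mathbb{Q}_X$,
and to show that this requires knowledge only of the LIBOR rate volatilities (and not, say, bond or
instantaneous forward rate volatilities as well). For the moment, assume that bond price
dynamics are given by some Ito processes

\[ \frac{dP(t,T_n)}{P(t,T_n)} = m(t,T_n)\,dt + S(t,T_n)\,dW_t \]

under the "real-world" probability measure $\mathbb{P}$. By definition of $X_t$ (i.e. by (*)), we see that

\[ \frac{dX_t}{X_t} = m(t,T_{n(t)})\,dt + S(t,T_{n(t)})\,dW_t \]

Moreover, if we apply Ito's formula to $1 + \delta_{n+1}L(t,T_n) = \frac{P(t,T_n)}{P(t,T_{n+1})}$ we see that

\begin{align*}
dL(t,T_n) &= \frac{P(t,T_n)}{\delta_{n+1}P(t,T_{n+1})}\Big[\big(m(t,T_n) - m(t,T_{n+1}) - (S(t,T_n) - S(t,T_{n+1}))\cdot S(t,T_{n+1})\big)\,dt \\
 &\qquad\qquad + \big(S(t,T_n) - S(t,T_{n+1})\big)\,dW_t\Big] \\
 &= \mu(t,T_n)\,dt + \zeta(t,T_n)\,dW_t
\end{align*}

where

\[ \mu(t,T_n) = \frac{P(t,T_n)}{\delta_{n+1}P(t,T_{n+1})}\big(m(t,T_n) - m(t,T_{n+1})\big) - \zeta(t,T_n)\cdot S(t,T_{n+1}) \]

\[ \zeta(t,T_n) = \frac{P(t,T_n)}{\delta_{n+1}P(t,T_{n+1})}\big(S(t,T_n) - S(t,T_{n+1})\big) \]

It follows that

\[ S(t,T_{n(t)}) - S(t,T_{j+1}) = \sum_{k=n(t)}^{j}\frac{\delta_{k+1}\zeta(t,T_k)}{1 + \delta_{k+1}L(t,T_k)} \qquad (**) \]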






for j > n{t). 

If 7(t) is the Girsanov kernel for transforming P to Qx, i-6. if 

Pit T \ 

then "-x^ lias zero drift under Qx- But 



d 



Xt 



Xt 



(m(t,Tn) - m{t,Tn(^t)) - Sit,Tn(t)) ■ iSit,Tn) - S{t,Tn(t)))^ dt 
+ (^S{t,Tn) - Sit,T„^t))) dWt 

Now in the Girsanov transformation, {S{t,Tn) — ^(i, r„(t))) • 74 is added to the P-drift to 
obtain the Qx^drift, which is zero, and so 

m{t,Tn) - m{t,T^^t)) - S{t,T^^t)) ■ (-5(t,r„) - Sit,T^^t))) + iS{t,Tn) - S{t,T^^t))) ■ It = 
which yields 

m{t,Tn) - mit,Tn+i) = [s{t,T^^t)) - 7*) • {s{t,Tn) - S{t,T^^t))) 
for n = 0, . . . , A/". It follows that 

m{t,Tn) - m{t,Tn+i) = (rn{t,Tn) - m{t,Tn{t)^ - (m{t,Tn+i) - m(t,Tn(t) 
= r„(,)) - 7t) • Tn) - S{t, Tn+l)) 

Now multiply both sides of this equation by ^ _^{plt T +1) obtain 
Looking back to the definitions of n and ( in the dynamics of L{t,Tn), we see that 

fl{t, Tn) = C{t, Tn) {S{t, T^^t)) -It- Sit, Tn+l)) 

and hence

$$dL(t,T_n) = \zeta(t,T_n)\cdot\Big[\big(S(t,T_{n(t)}) - S(t,T_{n+1}) - \gamma_t\big)\,dt + dW_t\Big]$$



These are, of course, the $P$-dynamics. To get the $Q_X$-dynamics, we must add volatility $\times$
kernel $= \zeta_t\cdot\gamma_t$ to the drift to obtain

$$dL(t,T_n) = \zeta(t,T_n)\cdot\Big[\big(S(t,T_{n(t)}) - S(t,T_{n+1})\big)\,dt + dW^X_t\Big]$$



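where $W^X_t = W_t - \int_0^t \gamma_u\,du$ is a $Q_X$-Brownian motion. Finally, using (**), we obtain

$$dL(t,T_n) = \sum_{k=n(t)}^{n} \frac{\delta_{k+1}\,\zeta(t,T_k)\cdot\zeta(t,T_n)}{1 + \delta_{k+1}L(t,T_k)}\,dt + \zeta(t,T_n)\,dW^X_t$$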



These are the forward LIBOR rate dynamics under the spot LIBOR measure. 
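
The drift in these dynamics is a finite sum over the tenor dates from $T_{n(t)}$ to $T_n$, so it is simple to evaluate once the current forward rates and volatility vectors are known. The following is a minimal sketch only: the function name, the list-based layout (`delta[k + 1]` is the accrual fraction paired with `libor[k]`) and the numerical inputs are illustrative assumptions, not part of the notes.

```python
def spot_libor_drift(n, n_t, libor, delta, zeta):
    """Drift of L(t, T_n) under the spot LIBOR measure (hypothetical helper).

    libor[k] -- current forward rate L(t, T_k)
    delta[k] -- accrual fraction delta_k, so delta[k + 1] pairs with libor[k]
    zeta[k]  -- volatility vector zeta(t, T_k), one entry per Brownian factor
    n_t      -- the index n(t) at which the sum starts
    """
    drift = 0.0
    for k in range(n_t, n + 1):
        # Inner product zeta(t, T_k) . zeta(t, T_n)
        dot = sum(a * b for a, b in zip(zeta[k], zeta[n]))
        drift += delta[k + 1] * dot / (1.0 + delta[k + 1] * libor[k])
    return drift


# Illustrative one-factor inputs (made-up numbers, not from the text).
libor = [0.05, 0.05, 0.05]
delta = [0.5, 0.5, 0.5, 0.5]
zeta = [[0.01], [0.01], [0.01]]
drift = spot_libor_drift(n=2, n_t=1, libor=libor, delta=delta, zeta=zeta)
```

In a simulation this drift would be recomputed at every time step, since both $n(t)$ and the rates $L(t,T_k)$ change with $t$.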






8.7 Exercises 

1. An endowment option X is a very long term European call option. Typically, 

• At issue, the initial strike $K_0$ is set to approximately 50% of the current stock price.

• The options are inflation and dividend protected: 

— The strike price increases at the short term riskless rate. 

— The strike price is decreased by the size of the dividend each time a dividend is 
paid. 

• The payoff at expiry $T$ is $X_T = (S_T - K_T)^+$.

We will make the simplifying assumption that the stock pays no dividends. This can
be accomplished by regarding the stock price as the theoretical price of a mutual fund
which starts off at one share, and reinvests all dividends in that share. We have, in the
risk-neutral world,

$$dS_t = r_t S_t\,dt + \sigma_t S_t\,dW_t \qquad dA_t = r_t A_t\,dt$$

where $S$ is the share and $A$ the money market account (with $A_0 = 1$). Clearly $K_t = K_0 A_t$.
By changing the numeraire to $A_t$, show that, when the volatility $\sigma_t$ is deterministic,

$$X_0 = S_0 N(d_+) - K_0 N(d_-)$$

where

$$d_\pm = \frac{\ln\frac{S_0}{K_0} \pm \frac{1}{2}\sigma_{av}^2 T}{\sigma_{av}\sqrt{T}} \quad\text{and}\quad \sigma_{av}^2 = \frac{1}{T}\int_0^T \sigma_t^2\,dt$$
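
As a numerical companion to Exercise 1, the sketch below evaluates the formula $X_0 = S_0 N(d_+) - K_0 N(d_-)$ with $d_\pm = \big(\ln(S_0/K_0) \pm \tfrac12\sigma_{av}^2 T\big)/(\sigma_{av}\sqrt{T})$ and $\sigma_{av}^2 = \frac1T\int_0^T \sigma_t^2\,dt$. The function name, the midpoint-rule integration and the sample inputs are assumptions made for illustration; the exercise itself still asks for the derivation.

```python
from math import log, sqrt
from statistics import NormalDist


def endowment_option_price(s0, k0, sigma, maturity, steps=10_000):
    """Price of the endowment option for a deterministic volatility function.

    sigma is a callable t -> sigma_t (assumed strictly positive somewhere).
    The integral of sigma_t^2 over [0, maturity] uses a midpoint rule.
    """
    dt = maturity / steps
    total_var = sum(sigma((i + 0.5) * dt) ** 2 for i in range(steps)) * dt
    sigma_av = sqrt(total_var / maturity)
    d_plus = (log(s0 / k0) + 0.5 * total_var) / (sigma_av * sqrt(maturity))
    d_minus = d_plus - sigma_av * sqrt(maturity)
    cdf = NormalDist().cdf
    return s0 * cdf(d_plus) - k0 * cdf(d_minus)


# Illustrative inputs: strike at 50% of spot, 20-year expiry, flat 20% volatility.
price = endowment_option_price(100.0, 50.0, lambda t: 0.20, 20.0)
```

Note that no interest rate appears as an input: because the strike grows at the short rate, the rate cancels once prices are expressed in units of the money market account.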

2. Use the change-of-numeraire technique to show how to calculate the value of an option
which pays the minimum of two assets $S^1, S^2$. Assume that the "real world" dynamics of
the assets are Ito diffusions of the form

$$dS^i = S^i\big[\mu_i\,dt + \sigma_i\,dW^i\big]$$

where $\mu_i, \sigma_i$ are constants, and that the correlation of returns is a constant $\rho$. Further
assume that $S^1$ has a continuously paid dividend with a constant dividend yield.

3. Consider a European call $C$ on share $S$ traded on FTSE. $S_t$ and $C$ are priced in pounds,
but the strike of the call is in dollars. Initially, the option is at-the-money. The dollar
strike does not change, but because exchange rates are not fixed, the pound strike does.
Let $X_t$ be the \$/£-rate, $Y_t$ the £/\$-rate. Assume dynamics

$$\begin{aligned}
dS_t &= \alpha_S S_t\,dt + \delta_S S_t\,dW^S_t \\
dX_t &= \alpha_X X_t\,dt + \delta_X X_t\,dW^X_t \\
dY_t &= \alpha_Y Y_t\,dt + \delta_Y Y_t\,dW^Y_t
\end{aligned}$$

where $W^S, W^X, W^Y$ are correlated Brownian motions.
3.1 Apply Ito's formula to show

$$dY_t = \alpha_Y Y_t\,dt + \delta_Y Y_t\,(-dW^X_t) \qquad \alpha_Y = -\alpha_X + \delta_X^2, \quad \delta_Y = \delta_X$$






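3.2 Let $\rho$ be the correlation between $W^S$ and $W^X$. Let $W_t = (W^1_t, W^2_t)$ be a two-dimensional
standard Brownian motion, and rewrite the above dynamics

$$\begin{aligned}
dS_t &= \alpha_S S_t\,dt + S_t\,\sigma_S\,dW_t \\
dX_t &= \alpha_X X_t\,dt + X_t\,\sigma_X\,dW_t \\
dY_t &= \alpha_Y Y_t\,dt + Y_t\,\sigma_Y\,dW_t
\end{aligned}$$

Show that we must have

$$\|\sigma_X\|^2 = \|\sigma_Y\|^2 = \delta_X^2 \qquad \|\sigma_S\|^2 = \delta_S^2$$

$$\sigma_X\cdot\sigma_S = \rho\,\delta_X\delta_S \qquad \sigma_Y\cdot\sigma_S = -\rho\,\delta_X\delta_S$$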

3.3 The initial pound strike is $K_0 = S_0$, and the initial dollar strike is $K^d = S_0X_0$ (at-the-money),
which remains fixed. At maturity, the pound strike is $K_T = K^dY_T$. Define
$S^d_t = S_tX_t$ to be the dollar price of $S$ at time $t$. Show that

$$dS^d_t = S^d_t\big[\alpha_S + \alpha_X + \sigma_S\cdot\sigma_X\big]\,dt + S^d_t(\sigma_S + \sigma_X)\,dW_t$$

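3.4 Now convert this to a system with a one-dimensional Brownian motion $V_t$:

$$dS^d_t = S^d_t\big[\alpha_S + \alpha_X + \sigma_S\cdot\sigma_X\big]\,dt + S^d_t\,\delta_{S^d}\,dV_t$$

where

$$\delta_{S^d}^2 = \|\sigma_X + \sigma_S\|^2 = \delta_X^2 + \delta_S^2 + 2\rho\,\delta_X\delta_S$$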

3.5 Now we have a plain vanilla call on an asset $S^d$ with (fixed) strike $K^d$. Find the dollar
price of this option:

$$C^d_t = S^d_t N(d_+) - e^{-r_d(T-t)}K^d N(d_-)$$

where $r_d$ is the riskless dollar rate, and

$$d_\pm = \frac{\ln\frac{S^d_t}{K^d} + \big(r_d \pm \frac{1}{2}\delta_{S^d}^2\big)(T-t)}{\delta_{S^d}\sqrt{T-t}}$$

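3.6 Conclude that the pound price of the option is

$$C_t = S_t N(d_+) - e^{-r_d(T-t)}K^dY_t N(d_-) \qquad d_\pm = \frac{\ln\frac{S_t}{K^dY_t} + \big(r_d \pm \frac{1}{2}(\delta_X^2 + \delta_S^2 + 2\rho\,\delta_X\delta_S)\big)(T-t)}{\sqrt{(\delta_X^2 + \delta_S^2 + 2\rho\,\delta_X\delta_S)(T-t)}}$$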

3.7 If we had tried to price the option directly in pounds, we would have had (explain
this)

$$C_T = \big(S_T - S_0(Y_T/Y_0)\big)^+$$

Very naturally, we would have considered the numeraire $Y_t$. This would have been a
mistake, for although $Y_t$ is a traded asset (namely the pound price of a dollar note),
this is not a non-dividend paying asset: $Y_t$ has a continuous dividend yield equal to the
riskless dollar rate $r_d$. Thus discounted $Y_t$ is not a $Q$-martingale. Instead, therefore,
consider the process $\tilde Y_t = Y_te^{r_dt}$ (i.e. all dividends = interest reinvested in the dollar
money market account). Show that

$$C_t = \tilde Y_t\,E^{\tilde Q}\big[(\tilde S_T - K')^+ \,\big|\, \mathcal F_t\big]$$

where $\tilde S_t = S_t/\tilde Y_t$, $K' = e^{-r_dT}S_0/Y_0$ and $\tilde Q$ is the equivalent martingale measure associated with
$\tilde Y$.

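3.8 Find the $\tilde Q$-dynamics of $\tilde S_t$ (with a two-dimensional standard Brownian motion).

3.9 Convert this to $\tilde S_t$-dynamics with a one-dimensional Brownian motion.

3.10 Hence show that

$$C_t = \tilde Y_t\big[\tilde S_t N(d_+) - K'N(d_-)\big]$$

where

$$d_\pm = \frac{\ln\frac{\tilde S_t}{K'} \pm \frac{1}{2}\delta^2(T-t)}{\delta\sqrt{T-t}}$$

and $\delta^2 = \|\sigma_S - \sigma_Y\|^2 = \delta_S^2 + \delta_Y^2 + 2\rho\,\delta_S\delta_Y$.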

3.11 Finally show that this coincides with the formula obtained earlier. 

In this case, you see that it is slightly easier to value the option in dollars than it is in 
pounds. 
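
Part 3.11 can be sanity-checked numerically before attempting the algebra. The sketch below prices the option once via the dollar route (Black-Scholes on $S^d = SX$, converted back with $Y = 1/X$) and once via the pound route (numeraire $\tilde Y_t = Y_te^{r_dt}$, strike $K' = e^{-r_dT}S_0/Y_0$), using the common volatility $\sqrt{\delta_X^2 + \delta_S^2 + 2\rho\,\delta_X\delta_S}$. Function names and all numerical inputs are illustrative assumptions.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def price_via_dollars(s_t, x_t, k_d, r_d, vol, tau):
    """Pound price obtained from the dollar Black-Scholes price of S^d = S * X."""
    s_d = s_t * x_t
    d_plus = (log(s_d / k_d) + (r_d + 0.5 * vol**2) * tau) / (vol * sqrt(tau))
    d_minus = d_plus - vol * sqrt(tau)
    dollar_price = s_d * N(d_plus) - exp(-r_d * tau) * k_d * N(d_minus)
    return dollar_price / x_t  # multiply by Y_t = 1 / X_t


def price_via_pounds(s_t, x_t, s_0, x_0, r_d, vol, t, T):
    """Pound price using the numeraire Y~_t = Y_t * exp(r_d * t)."""
    y_tilde = exp(r_d * t) / x_t
    s_tilde = s_t / y_tilde
    k_prime = exp(-r_d * T) * s_0 * x_0  # S_0 / Y_0 with Y_0 = 1 / X_0
    tau = T - t
    d_plus = (log(s_tilde / k_prime) + 0.5 * vol**2 * tau) / (vol * sqrt(tau))
    d_minus = d_plus - vol * sqrt(tau)
    return y_tilde * (s_tilde * N(d_plus) - k_prime * N(d_minus))


# Made-up market data for the comparison.
delta_s, delta_x, rho = 0.20, 0.10, 0.30
vol = sqrt(delta_x**2 + delta_s**2 + 2 * rho * delta_x * delta_s)
a = price_via_dollars(105.0, 1.6, 100.0 * 1.5, 0.03, vol, 0.75)
b = price_via_pounds(105.0, 1.6, 100.0, 1.5, 0.03, vol, 0.25, 1.0)
```

The two values agree up to floating-point rounding, which is what 3.11 asks to be shown in general.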

4. Suppose the bond price dynamics are given by

$$dp(t,T) = p(t,T)\,m(t,T)\,dt + p(t,T)\,v(t,T)\,dW_t$$

Show that in that case the forward rate dynamics are given by

$$df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,dW_t$$

where

$$\alpha(t,T) = v_T(t,T)\,v(t,T) - m_T(t,T) \qquad \sigma(t,T) = -v_T(t,T)$$

[Hint: Apply Ito's formula to lnp(t,T), write this in integrated form, and differentiate 
with respect to T] 

5. Let $\{y(0,T) : T > 0\}$ denote the zero-coupon yield curve at $t = 0$. Assume that, apart
from the zero coupon bonds, we also have exactly one fixed coupon bond for every maturity
$T$. Denote the yield-to-maturity of the fixed coupon bond by $y_M(0,T)$. We now have 3
curves to consider, the forward rate curve $f(0,T)$, the zero yield curve $y(0,T)$ and the
coupon yield curve $y_M(0,T)$.

5.1 Show that $f(0,T) = y(0,T) + T\,\frac{\partial y(0,T)}{\partial T}$

5.2 Assume that the zero yield curve is an increasing function of $T$. Show that in that
case

$$y_M(0,T) \le y(0,T) \le f(0,T)$$

for all $T$. Show that the inequalities are reversed if the zero yield curve is decreasing.
Explain this phenomenon in terms of simple economics.






5.3 Yield curves can be both upward and downward sloping. Can this be true for bond
price curves $p(0,T)$?

6. In the Cox-Ingersoll-Ross model, the risk-neutral short rate dynamics assumed are

$$dr_t = (b - ar_t)\,dt + \sigma\sqrt{r_t}\,dW_t, \qquad a, b, \sigma, r_0 > 0$$

6.1 Explain (heuristically) why this process is mean-reverting and non-negative.

6.2 This is an affine short rate model. By plugging $p(t,T) = e^{A(t,T) - B(t,T)r_t}$ into the term
structure PDE, show that we obtain two coupled ODE's

$$B_t = aB + \tfrac{1}{2}\sigma^2B^2 - 1 \qquad B(T,T) = 0$$
$$A_t = bB \qquad A(T,T) = 0$$

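6.3 To solve the Riccati equation for $B$, try a solution of the form

$$B(t,T) = \frac{X(t)}{cX(t) + d}$$

Choose $c$ to ensure that $ac + \frac{\sigma^2}{2} - c^2 = 0$. Show that we then obtain a first order linear
differential equation

$$X_t + \kappa X = -d \qquad\text{where } \kappa = -a + 2c = \sqrt{a^2 + 2\sigma^2}$$

6.4 Solve the ODE to obtain

$$X(t) = \frac{d}{\kappa}\big(e^{\kappa(T-t)} - 1\big)$$

6.5 Hence show that

$$B(t,T) = \frac{2\big(e^{\kappa(T-t)} - 1\big)}{2\kappa + (a + \kappa)\big(e^{\kappa(T-t)} - 1\big)} \qquad\text{where } \kappa = \sqrt{a^2 + 2\sigma^2}$$

6.6 Verify by differentiation that

$$A(t,T) = \frac{2b}{\sigma^2}\ln\left[\frac{2\kappa\,e^{(a+\kappa)(T-t)/2}}{2\kappa + (a + \kappa)\big(e^{\kappa(T-t)} - 1\big)}\right]$$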
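
The closed forms in 6.5 and 6.6 can be checked against the ODEs of 6.2 by finite differences before doing the differentiation by hand. The sketch below uses the standard CIR expressions $B = 2(e^{\kappa\tau}-1)/\big(2\kappa + (a+\kappa)(e^{\kappa\tau}-1)\big)$ and $A = \frac{2b}{\sigma^2}\ln\big[2\kappa e^{(a+\kappa)\tau/2}/\big(2\kappa + (a+\kappa)(e^{\kappa\tau}-1)\big)\big]$ with $\tau = T - t$; the function names and parameter values are illustrative assumptions.

```python
from math import exp, log, sqrt


def cir_B(tau, a, sigma):
    """B(t, T) as a function of time to maturity tau = T - t."""
    kappa = sqrt(a * a + 2.0 * sigma * sigma)
    e = exp(kappa * tau) - 1.0
    return 2.0 * e / (2.0 * kappa + (a + kappa) * e)


def cir_A(tau, a, b, sigma):
    """A(t, T) as a function of time to maturity tau = T - t."""
    kappa = sqrt(a * a + 2.0 * sigma * sigma)
    e = exp(kappa * tau) - 1.0
    numerator = 2.0 * kappa * exp(0.5 * (a + kappa) * tau)
    return (2.0 * b / sigma**2) * log(numerator / (2.0 * kappa + (a + kappa) * e))


def cir_bond_price(r, tau, a, b, sigma):
    """Zero coupon bond price p(t, T) = exp(A - B r) in the CIR model."""
    return exp(cir_A(tau, a, b, sigma) - cir_B(tau, a, sigma) * r)


# Finite-difference residuals of the two ODEs (d/dt = -d/dtau).
a, b, sigma, tau, h = 0.5, 0.03, 0.1, 2.0, 1e-5
B = cir_B(tau, a, sigma)
dB_dt = -(cir_B(tau + h, a, sigma) - cir_B(tau - h, a, sigma)) / (2 * h)
dA_dt = -(cir_A(tau + h, a, b, sigma) - cir_A(tau - h, a, b, sigma)) / (2 * h)
riccati_residual = dB_dt - (a * B + 0.5 * sigma**2 * B**2 - 1.0)
linear_residual = dA_dt - b * B
```

Both residuals should vanish up to discretisation error, and `cir_bond_price` should return 1 at `tau = 0`.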



7. 7.1 Show that the Hull-White model $dr = (\theta(t) - ar)\,dt + \sigma\,dW_t$ is obtained if one starts
with a HJM model given by

$$df(t,T) = \alpha(t,T)\,dt + \sigma e^{-a(T-t)}\,dW_t$$

Hence compute the function $\theta(t)$ which will make the short rate model fit the initial
term structure:

$$\theta(t) = f^*_T(0,t) + a f^*(0,t) + \frac{\sigma^2}{2a}\big[1 - e^{-2at}\big]$$

where $\{f^*(0,T) : T > 0\}$ is the observed term structure of forward rates. It follows
that the Hull-White model can also be fitted to any initial term structure. What is
the distribution of the forward rate $f(t,T)$?






7.2 Show that bond prices in the Hull-White model, fitted to the initial term structure,
are given by

$$p(t,T) = \frac{p(0,T)}{p(0,t)}\exp\left\{f(0,t)B(t,T) - \frac{\sigma^2}{4a}B^2(t,T)\big(1 - e^{-2at}\big) - B(t,T)r_t\right\}$$

where $B(t,T) = \frac{1}{a}\big[1 - e^{-a(T-t)}\big]$.

[Hint: The Hull-White model is an affine term structure model, i.e. $p(t,T) = e^{A(t,T) - B(t,T)r_t}$.
$B(t,T)$ is readily calculated. We can now find

Ait,T)= [ -eiu)Biu,T) du+la^B^iO,t) 

where is as in (a), i.e. bin) = e~""^x('u)e"", where = e~"*ro + Jq e~"(*~") du. 
Integrating by parts leads to 

^(i,r) = /(o,t)i?(t,r) + in^^°'^^ 



p(0,i) 



1-T 
u I 

+ 



2 



h^iu, T) - B^iO, u) du + B^iO, t)Bit, T) 



which simplifies to give the required result.] 

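7.3 Show that the Hull-White price of a call $C$ with strike $K$ and maturity $T$ on a bond
$p(\cdot,S)$ (where $S > T$) is given by

$$p(0,S)N(d_+) - Kp(0,T)N(d_-)$$

where

$$d_\pm = \frac{\ln\frac{p(0,S)}{Kp(0,T)} \pm \frac{1}{2}\sigma_{av}^2T}{\sigma_{av}\sqrt{T}}$$

and where

$$\sigma_{av}^2 = \frac{1}{T}\int_0^T \frac{\sigma^2}{a^2}\big(e^{-aT} - e^{-aS}\big)^2e^{2at}\,dt = \frac{\sigma^2}{2a^3T}\big(1 - e^{-a(S-T)}\big)^2\big(1 - e^{-2aT}\big)$$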
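
The bond option formula of 7.3 is easy to evaluate once the total variance $\sigma_{av}^2T = \frac{\sigma^2}{2a^3}\big(1 - e^{-a(S-T)}\big)^2\big(1 - e^{-2aT}\big)$ is known. The sketch below is an illustration under assumed inputs: the function names and the sample discount factors are not from the notes.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def hull_white_total_variance(T, S, a, sigma):
    """sigma_av^2 * T: integrated variance of ln(p(t,S)/p(t,T)) over [0, T]."""
    return (sigma**2 / (2.0 * a**3)
            * (1.0 - exp(-a * (S - T)))**2
            * (1.0 - exp(-2.0 * a * T)))


def hull_white_bond_call(p0_T, p0_S, strike, T, S, a, sigma):
    """Call with expiry T and strike K on the zero coupon bond maturing at S > T."""
    total_var = hull_white_total_variance(T, S, a, sigma)
    sd = sqrt(total_var)
    d_plus = (log(p0_S / (strike * p0_T)) + 0.5 * total_var) / sd
    d_minus = d_plus - sd
    return p0_S * N(d_plus) - strike * p0_T * N(d_minus)


# Made-up inputs: p(0,1) = 0.95, p(0,2) = 0.90, strike 0.94, a = 0.1, sigma = 1%.
call = hull_white_bond_call(0.95, 0.90, 0.94, 1.0, 2.0, 0.1, 0.01)
```

Only today's two bond prices enter; the function $\theta(t)$ has already been absorbed by fitting the model to the initial term structure.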



8. Consider the domestic and the foreign bond market, with bond prices denoted by $p_d(t,T)$
and $p_f(t,T)$ respectively. Take as given a standard HJM model for the domestic forward
rates $f_d(t,T)$

$$df_d(t,T) = \alpha_d(t,T)\,dt + \sigma_d(t,T)\,dW_t$$

where $W_t$ is a multidimensional Brownian motion under the domestic martingale measure
$Q$. The foreign forward rates are denoted by $f_f(t,T)$, and their $Q$-dynamics are given by

$$df_f(t,T) = \alpha_f(t,T)\,dt + \sigma_f(t,T)\,dW_t$$

Note that the same Brownian motion drives both bond markets. The exchange rate $X$ (in
units of domestic currency per unit of foreign currency) has $Q$-dynamics

$$dX_t = \mu(t)X(t)\,dt + \sigma_X(t)X(t)\,dW_t$$

Show that under the domestic martingale measure the foreign forward rates satisfy the
modified HJM drift condition

$$\alpha_f(t,T) = \sigma_f(t,T)\cdot\left[\int_t^T \sigma_f(t,s)\,ds - \sigma_X(t)\right]$$



9. A common implementation of the HJM framework uses the following forward rate dynamics:

$$df(t,T) = \alpha(t,T)\,dt + \big(\sigma_1, \sigma_2e^{-\frac{\lambda}{2}(T-t)}\big)\cdot\big(dW_1(t), dW_2(t)\big)$$

where $\sigma_1, \sigma_2, \lambda$ are non-negative constants, $W_1, W_2$ are independent $Q$-Brownian motions,
and $Q$ is the equivalent risk-neutral measure.

This is a two-factor model. The first factor $W_1(t)$ can be interpreted as a source of noise
that lasts a long time, affecting all maturities equally. The second factor $W_2(t)$ affects
short maturity forward rates more than the long term rates (why?), and thus adds some
extra volatility to the short term rates.

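9.1 Show that the HJM drift conditions imply that

$$\alpha(t,T) = \sigma_1^2(T - t) - \frac{2\sigma_2^2}{\lambda}e^{-\frac{\lambda}{2}(T-t)}\big(e^{-\frac{\lambda}{2}(T-t)} - 1\big)$$

9.2 Hence show that

$$\begin{aligned}
f(t,T) = {}& f(0,T) + \sigma_1^2t(T - t/2) - 2(\sigma_2/\lambda)^2\Big[e^{-\lambda T}\big(e^{\lambda t} - 1\big) - 2e^{-(\lambda/2)T}\big(e^{(\lambda/2)t} - 1\big)\Big] \\
& + \sigma_1W_1(t) + \sigma_2\int_0^t e^{-(\lambda/2)(T-u)}\,dW_2(u)
\end{aligned}$$

9.3 Show that the spot rate follows the process

$$r(t) = f(0,t) + \tfrac{1}{2}\sigma_1^2t^2 + 2(\sigma_2/\lambda)^2\big[1 - e^{-(\lambda/2)t}\big]^2 + \sigma_1W_1(t) + \sigma_2e^{-(\lambda/2)t}\int_0^t e^{(\lambda/2)u}\,dW_2(u)$$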

9.4 Is the short rate a Markov process, a Gaussian process, a stationary process? Explain. 

9.5 Calculate the price $C(t)$ of a call option on the zero coupon bond $p(t,T)$. Assume that
the option has strike $K$ and expiry $\tau$, where $t < \tau < T$.

[Hint: Let $p(t,\tau)$ be the numeraire. You know the HJM dynamics of zero coupon
bonds under $Q$, so the dynamics of $p(t,T)/p(t,\tau)$ under the EMM for $p(t,\tau)$ should
be easy to find. Of course, something is going to be lognormal. Now use the general
option pricing formula.]

9.6 As a check, assume that $\sigma_1 = 0.2$, $\sigma_2 = 0.3$ and $\lambda = 2$. Calculate the value of a call
option on a two-year zero coupon bond with strike 0.9 and expiry 1 year. Today's
prices are $P(0,1) = 0.9$, $P(0,2) = 0.81$. I get 0.076 (but I could be wrong, of course).
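
The check in 9.6 can be reproduced numerically. The sketch below is one possible route and rests on assumptions not spelled out in the exercise: it reads the second factor's volatility as $\sigma_2e^{-(\lambda/2)(T-t)}$, takes the volatility of $p(t,T)/p(t,\tau)$ to be $\big(\sigma_1(T-\tau),\ \tfrac{2\sigma_2}{\lambda}e^{(\lambda/2)t}(e^{-\lambda\tau/2} - e^{-\lambda T/2})\big)$ up to sign, integrates its squared norm over $[0,\tau]$, and applies the lognormal option formula at $t = 0$. The function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def two_factor_bond_call(p0_tau, p0_T, strike, tau, T, sigma1, sigma2, lam):
    """Time-0 call price on p(., T) with expiry tau < T in the two-factor model."""
    # Factor 1: constant volatility sigma1 * (T - tau) on [0, tau].
    var1 = sigma1**2 * (T - tau)**2 * tau
    # Factor 2: volatility c * exp(lam * t / 2); its square integrates in closed form.
    c = (2.0 * sigma2 / lam) * (exp(-0.5 * lam * tau) - exp(-0.5 * lam * T))
    var2 = c**2 * (exp(lam * tau) - 1.0) / lam
    sd = sqrt(var1 + var2)
    d_plus = (log(p0_T / (strike * p0_tau)) + 0.5 * sd**2) / sd
    d_minus = d_plus - sd
    return p0_T * N(d_plus) - strike * p0_tau * N(d_minus)


# Data of Exercise 9.6.
price = two_factor_bond_call(p0_tau=0.9, p0_T=0.81, strike=0.9, tau=1.0, T=2.0,
                             sigma1=0.2, sigma2=0.3, lam=2.0)
```

Under these assumptions the result rounds to the value 0.076 quoted in the exercise.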

10. Consider a convertible bond $X$ which, at $T_0$, allows the owner to convert the bond to $c$
shares $S$ of common stock. The bond is a zero coupon bond with face value 1.00 and
maturity $T_1 > T_0$. The aim of this problem is to find the price of the convertible bond at
some future date $t < T_0$. We will model the short rate using Ho-Lee dynamics. Initially,
the (instantaneous) forward rate curve is flat with $f(0,T) = r_0$ for all maturities $T$.

We work under a risk-neutral measure $Q$ where the share has dynamics

$$dS_t = r(t)S_t\,dt + \sigma_SS_t\,dW_t$$

and the short rate has dynamics

$$dr(t) = \theta(t)\,dt + \sigma_r\,dW_t$$

Here $W_t$ is a two-dimensional $Q$-Brownian motion, and $\sigma_S, \sigma_r$ are constant vectors.

10.1 Let $p(t,T)$ be a non-convertible zero coupon bond with face value 1.00 and maturity
$T$ years. Calculate the observed term structure of bond prices $\{p^*(0,T) : T > 0\}$.

10.2 Let $\tilde Q$ be the forward risk-neutral measure for maturity $T_1$ years (i.e. the EMM for
numeraire $p(t,T_1)$), and let $\tilde S_t = S_t/p(t,T_1)$. By decomposing the convertible bond into its option and bond
parts, show that

$$X_0 = c\,p(0,T_1)\,E^{\tilde Q}\big[(\tilde S_{T_0} - \tfrac{1}{c})^+\big] + p(0,T_1)$$

10.3 The Ho-Lee model is an affine term structure model, i.e. bond prices are of the form

$$p(t,T) = e^{A(t,T) - B(t,T)r(t)}$$

By substituting this expression into the term structure PDE, show that

$$B(t,T) = T - t \qquad A(t,T) = \int_t^T \theta(u)(u - T)\,du + \tfrac{1}{6}\|\sigma_r\|^2(T - t)^3$$

10.4 Show that

$$\frac{dp(t,T)}{p(t,T)} = r(t)\,dt - \sigma_r(T - t)\,dW_t$$

10.5 Fit the Ho-Lee model to the initial term structure of forward rates: Show that

$$\theta(t) = \|\sigma_r\|^2t$$

10.6 Hence show that

$$r(t) = r_0 + \tfrac{1}{2}\|\sigma_r\|^2t^2 + \sigma_rW_t$$

10.7 Hence, using the known initial bond prices and short rate dynamics, show that future
bond prices are given by

$$p(t,T) = e^{-\frac{1}{2}\|\sigma_r\|^2t(T-t)^2 - r(t)(T-t)}$$

What is the distribution of $p(t,T)$ under $Q$?

10.8 Show that $\tilde S_t$ has dynamics

$$\frac{d\tilde S_t}{\tilde S_t} = \big(\sigma_S + \sigma_r(T_1 - t)\big)\,d\tilde W_t$$

under the measure $\tilde Q$, where $\tilde W_t$ is a $\tilde Q$-Brownian motion.
What is the distribution of $\tilde S_t$ under $\tilde Q$?






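10.9 Deduce that

$$X_0 = cS_0N(d_+) - p(0,T_1)N(d_-) + p(0,T_1)$$

where

$$d_\pm = \frac{\ln\frac{cS_0}{p(0,T_1)} \pm \frac{1}{2}\Sigma^2T_0}{\Sigma\sqrt{T_0}} \qquad \Sigma^2 = \frac{1}{T_0}\int_0^{T_0}\|\sigma_S + \sigma_r(T_1 - s)\|^2\,ds$$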



11. The aim of this problem is to calculate the price of an in-arrears caplet in the Ho-Lee
model, where the short rate has risk-neutral dynamics

$$dr_t = \theta(t)\,dt + \sigma\,dW_t$$

Here, $\sigma$ is a constant, and $W_t$ is a 1-dimensional Brownian motion under the risk-neutral
measure $Q$. The caplet has payoff

$$0.5\max\{L - R_c, 0\}$$

at expiry $T_2 = 1$ year, where $L$ is the 6-month spot LIBOR rate in 6 months' time, and $R_c$
is the cap rate. Use the following data:

    P(0,T)    10%
    R_c       12%
    σ         10%

Here $P(0,T)$ is the default-free zero coupon bond with face value 1 and maturity $T$.



We proceed as follows: We first show that the caplet is equivalent to a portfolio of put
options on zero coupon bonds. Then we recast the Ho-Lee model within the HJM framework
in order to fit it to the observed (flat) term structure, and calculate the prices of zero
coupon bonds. Finally, we calculate the prices of vanilla options on zero coupon bonds.

11.1 First show that a caplet can be regarded as a portfolio of 6-month put options on
the 1-year zero:

Caplet $= (1 + R_c\Delta T)$ put options on $P(t,T_2)$ with strike $\frac{1}{1 + R_c\Delta T}$ and maturity $T_1$

where $T_1 = 0.5$, $\Delta T = 0.5$, and $T_2 = T_1 + \Delta T = 1$.

11.2 The Ho-Lee model is an affine short rate model, with bond prices of the form
$P(t,T) = e^{A(t,T) - B(t,T)r_t}$. By substituting this form of $P(t,T)$ into the term structure
PDE, show that

$$B(t,T) = T - t$$

$$A(t,T) = -\int_t^T \theta(u)(T - u)\,du + \tfrac{1}{6}\sigma^2(T - t)^3$$






11.3 In order to fit the short rate model to the observed term structure, we recast it in 
the HJM framework. You may use the facts about the HJM model which are stated 
on the formula sheet. 

Using the relation between forward rates and zero coupon bond prices and the value
of $B(t,T)$, show that the instantaneous forward rate $f(t,T)$ has a constant "volatility"
$\sigma$, i.e. that the forward rate dynamics are

$$df(t,T) = \alpha(t,T)\,dt + \sigma\,dW_t$$

for some function $\alpha(t,T)$.

11.4 Use the HJM drift conditions to show that $\alpha(t,T) = \sigma^2(T - t)$

11.5 Hence show that

$$r_t = r_0 + \tfrac{1}{2}\sigma^2t^2 + \sigma W_t$$

and conclude that

$$dr_t = \sigma^2t\,dt + \sigma\,dW_t$$

11.6 Next, show that

$$A(t,T) = -\tfrac{1}{2}\sigma^2t(T - t)^2$$

and thus that zero coupon bond prices are given by

$$P(t,T) = e^{-\frac{1}{2}\sigma^2t(T-t)^2 - (T-t)r_t}$$

11.7 Now that bond prices have been found, we will price bond options. Recall the general
option formula stated in the Formula Sheet. Change the numeraire to the 6-month
zero coupon bond $P(t,T_1)$. Let $\hat P_t = \frac{P(t,T_2)}{P(t,T_1)}$. Write down the dynamics of $\hat P_t$ under
the $T_1$-forward neutral measure $Q_1$.

11.8 Show that $\hat P_{T_1}$ is lognormally distributed under $Q_1$, and find its distribution.

11.9 Similarly, find the dynamics of $\hat P_t^{-1} = \frac{P(t,T_1)}{P(t,T_2)}$ under the $T_2$-forward measure $Q_2$. Show that
$\hat P_{T_1}^{-1}$ is lognormally distributed under $Q_2$.

11.10 Deduce the following formula for a call option on $P(t,T_2)$ with strike $K$ and maturity $T_1$:

$$C_0 = P(0,T_2)N(d_+) - KP(0,T_1)N(d_-)$$

Write down expressions for $d_\pm$.

11.11 Use put-call parity and the table of the normal distribution to find the price of the 
original caplet. 
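
The steps of Exercise 11 can be strung together numerically as a rough cross-check of a hand calculation. The sketch below makes two assumptions that the exercise does not state explicitly: the flat 10% term structure is read as continuously compounded, so that $P(0,T) = e^{-0.1T}$, and the total variance of $\ln\big(P(T_1,T_2)/P(T_1,T_1)\big)$ is taken to be $\sigma^2(T_2 - T_1)^2T_1$, which follows from the Ho-Lee bond volatility $-\sigma(T - t)$. Function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def ho_lee_bond_call(p_t1, p_t2, strike, t1, t2, sigma):
    """Time-0 call with expiry t1 on the zero coupon bond maturing at t2 > t1."""
    total_sd = sigma * (t2 - t1) * sqrt(t1)
    d_plus = (log(p_t2 / (strike * p_t1)) + 0.5 * total_sd**2) / total_sd
    d_minus = d_plus - total_sd
    return p_t2 * N(d_plus) - strike * p_t1 * N(d_minus)


def ho_lee_caplet(flat_rate, cap_rate, sigma, t1=0.5, dt=0.5):
    """Caplet value as (1 + R_c * dt) puts on P(., t1 + dt) struck at 1 / (1 + R_c * dt)."""
    t2 = t1 + dt
    # ASSUMPTION: flat, continuously compounded initial curve.
    p_t1, p_t2 = exp(-flat_rate * t1), exp(-flat_rate * t2)
    strike = 1.0 / (1.0 + cap_rate * dt)
    call = ho_lee_bond_call(p_t1, p_t2, strike, t1, t2, sigma)
    put = call - p_t2 + strike * p_t1  # put-call parity for bond options
    return (1.0 + cap_rate * dt) * put


# Data of Exercise 11: flat rate 10%, cap rate 12%, sigma 10%.
caplet = ho_lee_caplet(flat_rate=0.10, cap_rate=0.12, sigma=0.10)
```

A different compounding convention for the 10% rate would change the discount factors and hence the price slightly, so the output should be compared with a hand calculation made under the same convention.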



